Sheldon Axler 


Linear Algebra 
Done Right 

Second Edition 



Springer 





© 1997, 1996 Springer-Verlag New York, Inc. 

All rights reserved. This work may not be translated or copied in whole or in part without the 
written permission of the publisher (Springer-Verlag New York, Inc., 175 Fifth Avenue, New 
York, NY 10010, USA), except for brief excerpts in connection with reviews or scholarly 
analysis. Use in connection with any form of information storage and retrieval, electronic
adaptation, computer software, or by similar or dissimilar methodology now known or hereafter
developed is forbidden. 

The use of general descriptive names, trade names, trademarks, etc., in this publication, even if 
the former are not especially identified, is not to be taken as a sign that such names, as
understood by the Trade Marks and Merchandise Marks Act, may accordingly be used freely by any-


ISBN 0-387-98259-0 (hardcover) 
ISBN 0-387-98258-2 (softcover) 


SPIN 10629393 
SPIN 10794473 


Springer-Verlag New York Berlin Heidelberg 
A member of BertelsmannSpringer Science+Business Media GmbH






Contents 


Preface to the Instructor 

Preface to the Student 

Acknowledgments 

Chapter 1 
Vector Spaces 


Complex Numbers 



Chapter 2 

Finite-Dimensional Vector Spaces 








Real Coefficients 


Exercises

Chapter 5

Eigenvalues and Eigenvectors

Invariant Subspaces

Polynomials Applied to Operators

Upper-Triangular Matrices

Diagonal Matrices

Invariant Subspaces on Real Vector Spaces

Exercises

Chapter 6

Inner-Product Spaces

Inner Products

Norms

Orthonormal Bases

Orthogonal Projections and Minimization Problems

Linear Functionals and Adjoints

Exercises

Chapter 7

Operators on Inner-Product Spaces

Self-Adjoint and Normal Operators

The Spectral Theorem

Normal Operators on Real Inner-Product Spaces

Positive Operators

Isometries

Polar and Singular-Value Decompositions

Exercises

Chapter 8

Operators on Complex Vector Spaces

Generalized Eigenvectors

The Characteristic Polynomial

Decomposition of an Operator


























Square Roots

The Minimal Polynomial

Jordan Form

Exercises

Chapter 9

Operators on Real Vector Spaces

Eigenvalues of Square Matrices

Block Upper-Triangular Matrices

The Characteristic Polynomial

Exercises

Chapter 10

Trace and Determinant

Change of Basis

Trace

Determinant of an Operator

Determinant of a Matrix

Volume

Exercises

Symbol Index

Index















Preface to the Instructor


e probably about to teach a course that will give students 


s determinant equals 0, and 
This tortuous (torturous?) 
igenvalues must exist, 
e proofs presented here of- 


d for suitable mathematical maturity. 




































f the eigenvalues and the determi- 
envalues (both counting multiplic- 


ppearance of the 


instead, t 




























linants only on complex vector spaces). 

Lt than teaching any particular set of theorems 
the ability to understand and manipulate the 
..Mathematics can be learned only by doing; 
，a has many good homework problems. When 














Preface to the Student



I would greatly appreciate hearing about any errors in this book, 
even minor ones. I welcome your suggestions for improvements, even 
tiny ones. 

Have fun! 


Sheldon Axler 

Mathematics Department 

San Francisco State University 

San Francisco, CA 94132, USA 

e-mail: axler@math.sfsu.edu 

www home page: http://math.sfsu.edu/axler
















Acknowledgments


















Chapter 1 

Vector Spaces


Linear algebra is the study of linear maps on finite-dimensional vec- 




eroots 


a square root of $-1$, denoted $i$, and manipulate it using the usual rules
of arithmetic. Formally, a complex number is an ordered pair $(a, b)$,
where $a, b \in \mathbf{R}$, but we will write this as $a + bi$. The set of all complex
numbers is denoted by $\mathbf{C}$:

\[ \mathbf{C} = \{a + bi : a, b \in \mathbf{R}\}. \]

If $a \in \mathbf{R}$, we identify $a + 0i$ with the real number $a$. Thus we can think
of $\mathbf{R}$ as a subset of $\mathbf{C}$.

Addition and multiplication on $\mathbf{C}$ are defined by

\[ (a + bi) + (c + di) = (a + c) + (b + d)i, \]

\[ (a + bi)(c + di) = (ac - bd) + (ad + bc)i; \]

here $a, b, c, d \in \mathbf{R}$. Using multiplication as defined above, you should
verify that $i^2 = -1$. Do not memorize the formula for the product
of two complex numbers; you can always rederive it by recalling that
$i^2 = -1$ and then using the usual rules of arithmetic.
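
The two defining formulas can be checked mechanically. The following is a minimal Python sketch, assuming complex numbers are modelled as ordered pairs of reals as in the formal definition above; the function names `add` and `mul` are illustrative and do not come from the text, and Python's built-in `complex` type is used only as a cross-check.

```python
# Complex numbers modelled as ordered pairs (a, b) of reals, written a + bi.

def add(z, w):
    """(a + bi) + (c + di) = (a + c) + (b + d)i."""
    a, b = z
    c, d = w
    return (a + c, b + d)

def mul(z, w):
    """(a + bi)(c + di) = (ac - bd) + (ad + bc)i."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)

i = (0, 1)
i_squared = mul(i, i)          # the pair representing i^2
product = mul((1, 2), (3, 4))  # (1 + 2i)(3 + 4i) as a pair

# Cross-check against Python's built-in complex type.
builtin = (1 + 2j) * (3 + 4j)
```

Here `i_squared` is the pair `(-1, 0)`, which is identified with the real number $-1$, in agreement with $i^2 = -1$.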

You should verify, using the familiar properties of the real numbers,
that addition and multiplication on $\mathbf{C}$ satisfy the following properties:

commutativity

$w + z = z + w$ and $wz = zw$ for all $w, z \in \mathbf{C}$;

associativity

$(z_1 + z_2) + z_3 = z_1 + (z_2 + z_3)$ and $(z_1 z_2) z_3 = z_1 (z_2 z_3)$ for all
$z_1, z_2, z_3 \in \mathbf{C}$;

identities

$z + 0 = z$ and $z1 = z$ for all $z \in \mathbf{C}$;

additive inverse

for every $z \in \mathbf{C}$, there exists a unique $w \in \mathbf{C}$ such that $z + w = 0$;

multiplicative inverse

for every $z \in \mathbf{C}$ with $z \neq 0$, there exists a unique $w \in \mathbf{C}$ such that
$zw = 1$;













distributive property 

$\lambda(w + z) = \lambda w + \lambda z$ for all $\lambda, w, z \in \mathbf{C}$.

For $z \in \mathbf{C}$, we let $-z$ denote the additive inverse of $z$. Thus $-z$ is
the unique complex number such that

\[ z + (-z) = 0. \]

Subtraction on $\mathbf{C}$ is defined by

\[ w - z = w + (-z) \]

for $w, z \in \mathbf{C}$.

For $z \in \mathbf{C}$ with $z \neq 0$, we let $1/z$ denote the multiplicative inverse
of $z$. Thus $1/z$ is the unique complex number such that

\[ z(1/z) = 1. \]

Division on $\mathbf{C}$ is defined by

\[ w/z = w(1/z) \]

for $w, z \in \mathbf{C}$ with $z \neq 0$.

So that we can conveniently make definitions and prove theorems 
that apply to both real and complex numbers, we adopt the following 
notation: 


Throughout this book, 

F stands for either R or C. 

Thus if we prove a theorem involving $\mathbf{F}$, we will know that it holds when
$\mathbf{F}$ is replaced with $\mathbf{R}$ and when $\mathbf{F}$ is replaced with $\mathbf{C}$. Elements of $\mathbf{F}$ are
called scalars. The word “scalar”, which means number, is often used
when we want to emphasize that an object is a number, as opposed to
a vector (vectors will be defined soon).

For $z \in \mathbf{F}$ and $m$ a positive integer, we define $z^m$ to denote the
product of $z$ with itself $m$ times:

\[ z^m = \underbrace{z \cdots z}_{m \text{ times}}. \]

Clearly $(z^m)^n = z^{mn}$ and $(wz)^m = w^m z^m$ for all $w, z \in \mathbf{F}$ and all
positive integers $m, n$.


The letter $\mathbf{F}$ is used
because $\mathbf{R}$ and $\mathbf{C}$ are
examples of what are
called fields. In this
book we will not need
to deal with fields other
than $\mathbf{R}$ or $\mathbf{C}$. Many of
the definitions,
theorems, and proofs
in linear algebra that
work for both $\mathbf{R}$ and $\mathbf{C}$
also work without
change if an arbitrary
field replaces $\mathbf{R}$ or $\mathbf{C}$.






Definition of Vector Space

Before defining what a vector space is, let’s look at two important 
examples. The vector space $\mathbf{R}^2$, which you can think of as a plane,
consists of all ordered pairs of real numbers:

\[ \mathbf{R}^2 = \{(x, y) : x, y \in \mathbf{R}\}. \]

The vector space $\mathbf{R}^3$, which you can think of as ordinary space, consists
of all ordered triples of real numbers:

\[ \mathbf{R}^3 = \{(x, y, z) : x, y, z \in \mathbf{R}\}. \]

To generalize $\mathbf{R}^2$ and $\mathbf{R}^3$ to higher dimensions, we first need to discuss
the concept of lists. Suppose $n$ is a nonnegative integer. A list of
length $n$ is an ordered collection of $n$ objects (which might be numbers,
other lists, or more abstract entities) separated by commas and
surrounded by parentheses. A list of length $n$ looks like this:

\[ (x_1, \dots, x_n). \]

Thus a list of length 2 is an ordered pair and a list of length 3 is an
ordered triple. For $j \in \{1, \dots, n\}$, we say that $x_j$ is the $j$th coordinate
of the list above. Thus $x_1$ is called the first coordinate, $x_2$ is called the
second coordinate, and so on.

Many mathematicians
call a list of length $n$ an
$n$-tuple.

Sometimes we will use the word list without specifying its length.
Remember, however, that by definition each list has a finite length that
is a nonnegative integer, so that an object that looks like

\[ (x_1, x_2, \dots), \]

which might be said to have infinite length, is not a list. A list of length
0 looks like this: $(\,)$. We consider such an object to be a list so that
some of our theorems will not have trivial exceptions.

Two lists are equal if and only if they have the same length and
the same coordinates in the same order. In other words, $(x_1, \dots, x_m)$
equals $(y_1, \dots, y_n)$ if and only if $m = n$ and $x_1 = y_1, \dots, x_m = y_m$.
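
Python tuples behave much like the lists just defined, so they give a convenient way to experiment with list equality. This is only an illustrative sketch; the variable names are assumptions made for the example.

```python
# Tuples have a finite length, their order matters, and repetitions are allowed.
empty = ()                       # a list of length 0
pair = (3, 5)                    # a list of length 2 (an ordered pair)

same = (3, 5) == (3, 5)          # same length, same coordinates, same order
reordered = (3, 5) == (5, 3)     # same coordinates in a different order
repeated = (4, 4) == (4, 4, 4)   # different lengths

first_coordinate = pair[0]       # Python indexes from 0, the text from 1
```

Only `same` is true: changing the order or the length produces a different list.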

Lists differ from sets in two ways: in lists, order matters and repetitions
are allowed, whereas in sets, order and repetitions are irrelevant.
For example, the lists (3, 5) and (5, 3) are not equal, but the sets {3, 5} 
and {5,3} are equal. The lists (4,4) and (4,4,4) are not equal (they 




























































The motivation for the definition of a vector space comes from the
important properties possessed by addition and scalar multiplication
on $\mathbf{F}^n$. Specifically, addition on $\mathbf{F}^n$ is commutative and associative and
has an identity, namely, 0. Every element has an additive inverse. Scalar
multiplication on $\mathbf{F}^n$ is associative, and scalar multiplication by 1 acts
as a multiplicative identity should. Finally, addition and scalar
multiplication on $\mathbf{F}^n$ are connected by distributive properties.

We will define a vector space to be a set $V$ along with an addition
and a scalar multiplication on $V$ that satisfy the properties discussed
in the previous paragraph. By an addition on $V$ we mean a function
that assigns an element $u + v \in V$ to each pair of elements $u, v \in V$.
By a scalar multiplication on $V$ we mean a function that assigns an
element $av \in V$ to each $a \in \mathbf{F}$ and each $v \in V$.

Now we are ready to give the formal definition of a vector space.
A vector space is a set $V$ along with an addition on $V$ and a scalar
multiplication on $V$ such that the following properties hold:

commutativity

$u + v = v + u$ for all $u, v \in V$;

associativity

$(u + v) + w = u + (v + w)$ and $(ab)v = a(bv)$ for all $u, v, w \in V$
and all $a, b \in \mathbf{F}$;

additive identity

there exists an element $0 \in V$ such that $v + 0 = v$ for all $v \in V$;

additive inverse

for every $v \in V$, there exists $w \in V$ such that $v + w = 0$;

multiplicative identity

$1v = v$ for all $v \in V$;

distributive properties

$a(u + v) = au + av$ and $(a + b)u = au + bu$ for all $a, b \in \mathbf{F}$ and
all $u, v \in V$.
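
The properties above can be spot-checked on concrete vectors. The sketch below assumes the usual coordinatewise addition and scalar multiplication on $\mathbf{R}^2$; the helper names `add` and `scale` and the sample vectors are illustrative. Passing these checks on a few samples is evidence, not a proof, that the axioms hold.

```python
from fractions import Fraction

def add(u, v):
    """Coordinatewise addition of two vectors of the same length."""
    return tuple(x + y for x, y in zip(u, v))

def scale(a, v):
    """Multiply every coordinate of v by the scalar a."""
    return tuple(a * x for x in v)

u, v, w = (1, 2), (3, -1), (0, 5)
a, b = Fraction(1, 2), Fraction(3)
zero = (0, 0)

checks = {
    "commutativity": add(u, v) == add(v, u),
    "associativity": add(add(u, v), w) == add(u, add(v, w))
                     and scale(a * b, v) == scale(a, scale(b, v)),
    "additive identity": add(v, zero) == v,
    "additive inverse": add(v, scale(-1, v)) == zero,
    "multiplicative identity": scale(1, v) == v,
    "distributive properties": scale(a, add(u, v)) == add(scale(a, u), scale(a, v))
                               and scale(a + b, u) == add(scale(a, u), scale(b, u)),
}
```

Exact rational scalars (`Fraction`) are used so that the equalities are not disturbed by floating-point rounding.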

The scalar multiplication in a vector space depends upon F. Thus 
when we need to be precise, we will say that V is a vector space over F 



































Subspaces

A subset $U$ of $V$ is called a subspace of $V$ if $U$ is also a vector space
(using the same addition and scalar multiplication as on $V$). For example,

\[ \{(x_1, x_2, 0) : x_1, x_2 \in \mathbf{F}\} \]

is a subspace of $\mathbf{F}^3$.

If $U$ is a subset of $V$, then to check that $U$ is a subspace of $V$ we
need only check that $U$ satisfies the following:

additive identity

$0 \in U$;

closed under addition

$u, v \in U$ implies $u + v \in U$;

closed under scalar multiplication

$a \in \mathbf{F}$ and $u \in U$ implies $au \in U$.
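
The three conditions can be tried out on the example $\{(x_1, x_2, 0)\}$ from above. This is a minimal Python sketch; the membership tests `in_U` and `in_A`, and the set $A = \{(x_1, x_2, 1)\}$ used as a contrast, are hypothetical names introduced for illustration. Checking a few sample vectors does not prove closure in general.

```python
def add(u, v):
    """Coordinatewise addition."""
    return tuple(x + y for x, y in zip(u, v))

def scale(a, v):
    """Scalar multiplication."""
    return tuple(a * x for x in v)

def in_U(v):
    """Membership test for U = {(x1, x2, 0)}."""
    return len(v) == 3 and v[2] == 0

def in_A(v):
    """Membership test for the hypothetical set A = {(x1, x2, 1)}."""
    return len(v) == 3 and v[2] == 1

zero = (0, 0, 0)
u, v = (1, 2, 0), (-4, 7, 0)

# U passes all three conditions on these samples.
U_ok = in_U(zero) and in_U(add(u, v)) and in_U(scale(5, u))

# A already fails the first condition: it does not contain 0.
A_has_zero = in_A(zero)
```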

The first condition insures that the additive identity of $V$ is in $U$. The
second condition insures that addition makes sense on $U$. The third
condition insures that scalar multiplication makes sense on $U$. To show

Clearly $\{0\}$ is the
smallest subspace of $V$
and $V$ itself is the

use the term linear
subspace, which means
the same as subspace.




































completing t 




implies that 








additive 


Prove or 
such thal 

then Ui = 
14. Suppose 


Prove or give a counterexample: if $U_1, U_2, W$ are subspaces of $V$
such that

\[ V = U_1 \oplus W \quad \text{and} \quad V = U_2 \oplus W, \]

then $U_1 = U_2$.







Chapter 2 

Finite-Dimensional
Vector Spaces


In the last chapter we learned about vector spaces. Linear algebra 
focuses not on arbitrary vector spaces, but on finite-dimensional vector 
spaces, which we introduce in this chapter. Here we will deal with the 
key concepts associated with these spaces: span, linear independence, 
basis, and dimension. 

Let’s review our standing assumptions: 

Recall that F denotes R or C. 

Recall also that V is a vector space over F. 






Span and Linear Independence 

A linear combination of a list $(v_1, \dots, v_m)$ of vectors in $V$ is a vector
of the form

\[ a_1 v_1 + \dots + a_m v_m, \tag{2.1} \]

where $a_1, \dots, a_m \in \mathbf{F}$. The set of all linear combinations of $(v_1, \dots, v_m)$








e are slightly abusing notation by letting z K denote 
dummy variable). 

lat is not finite dimensional is called infinite di- 
nDle. T(¥) is infinite dimensional. To Drove this. 







































construction. 

,then U is finite dimensional and we are done. If U ^ 
: hoose a nonzero vector v\ g U. 

， ■ ■ ■ ， Vj_i), then U is finite dimensional and we are 















i constructed 


and thus the process must eventually terminate, which means that $U$
is finite dimensional. ■

Bases

A basis of V is a list of vectors in V that is linearly independent and 
spans V. For example, 


the 


2.8 Proposition: A list $(v_1, \dots, v_n)$ of vectors in $V$ is a basis of $V$
if and only if every $v \in V$ can be written uniquely in the form

\[ v = a_1 v_1 + \dots + a_n v_n, \tag{2.9} \]

where $a_1, \dots, a_n \in \mathbf{F}$.

Proof: First suppose that $(v_1, \dots, v_n)$ is a basis of $V$. Let $v \in V$.
Because $(v_1, \dots, v_n)$ spans $V$, there exist $a_1, \dots, a_n \in \mathbf{F}$ such that 2.9
holds. To show that the representation in 2.9 is unique, suppose that
$b_1, \dots, b_n$ are scalars so that we also have





























dimensional. 











Dimension



































[nation of the basis vect 
linear combinations int 
i linear combination of 
alars used in this linear ( 





Suppose $m$ is a positive integer. Is the set consisting of 0 and all
polynomials with coefficients in $\mathbf{F}$ and with degree equal to $m$ a
subspace of $\mathcal{P}(\mathbf{F})$?

Prove that $\mathbf{F}^\infty$ is infinite dimensional.

Prove that the real vector space consisting of all continuous real-valued
functions on the interval $[0, 1]$ is infinite dimensional.






11. Suppose that 
such that dim 

12. Suppose that 
pj(2) = 0 for 



16. Prove that if V is finite dimensional 
of V, then 

dim([/i + … + U m ) < din 

17. Suppose V is finite dimensional. 









Let’s agree that for the rest of this chapter 
W will denote a vector space over F. 







Definitions and Examples


Some mathematicians 
use the term linear 
transformation, which 
means the same as 
linear map. 


A linear map from $V$ to $W$ is a function $T \colon V \to W$ with the following
properties:

additivity

$T(u + v) = Tu + Tv$ for all $u, v \in V$;

homogeneity

$T(av) = a(Tv)$ for all $a \in \mathbf{F}$ and all $v \in V$.
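
Additivity and homogeneity are easy to test on sample vectors. In the Python sketch below, the map `T` (given by $T(x, y) = (x + 2y, 3x)$) and the map `f` (given by $f(x, y) = (x + 1, y)$) are hypothetical examples chosen for illustration, not maps from the text. A passing spot-check does not prove linearity, but a single failing one disproves it.

```python
def add(u, v):
    """Coordinatewise addition."""
    return tuple(x + y for x, y in zip(u, v))

def scale(a, v):
    """Scalar multiplication."""
    return tuple(a * x for x in v)

def T(v):
    """Hypothetical linear map on R^2: T(x, y) = (x + 2y, 3x)."""
    x, y = v
    return (x + 2 * y, 3 * x)

def f(v):
    """Hypothetical map that is not linear: f(x, y) = (x + 1, y)."""
    x, y = v
    return (x + 1, y)

def is_additive(g, u, v):
    """Check g(u + v) == g(u) + g(v) for one pair of vectors."""
    return g(add(u, v)) == add(g(u), g(v))

def is_homogeneous(g, a, v):
    """Check g(av) == a g(v) for one scalar and one vector."""
    return g(scale(a, v)) == scale(a, g(v))

u, v, a = (1, 2), (3, -4), 5
T_passes = is_additive(T, u, v) and is_homogeneous(T, a, v)
f_passes = is_additive(f, u, v)
```

The translation `f` fails additivity because the constant 1 is added once on the left side and twice on the right.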


Note that for linear maps we often use the notation Tv as well as the 
more standard functional notation T(v). 

The set of all linear maps from $V$ to $W$ is denoted $\mathcal{L}(V, W)$. Let’s
look at some examples of linear maps. Make sure you verify that each
of the functions defined below is indeed a linear map:

zero

In addition to its other uses, we let the symbol 0 denote the function
that takes each element of some vector space to the additive
identity of another vector space. To be specific, $0 \in \mathcal{L}(V, W)$ is
defined by

\[ 0v = 0. \]

Note that the 0 on the left side of the equation above is a function
from $V$ to $W$, whereas the 0 on the right side is the additive identity
in $W$. As usual, the context should allow you to distinguish
between the many uses of the symbol 0.

identity

The identity map, denoted $I$, is the function on some vector space
that takes each element to itself. To be specific, $I \in \mathcal{L}(V, V)$ is
defined by

\[ Iv = v. \]









mathematicians 











write $ST$ instead of $S \circ T$. You should verify that $ST$ is indeed a linear
map from $U$ to $W$ whenever $T \in \mathcal{L}(U, V)$ and $S \in \mathcal{L}(V, W)$. Note that
$ST$ is defined only when $T$ maps into the domain of $S$. We often call
$ST$ the product of $S$ and $T$. You should verify that it has most of the
usual properties expected of a product:

associativity

$(T_1 T_2) T_3 = T_1 (T_2 T_3)$ whenever $T_1$, $T_2$, and $T_3$ are linear maps such
that the products make sense (meaning that $T_3$ must map into the
domain of $T_2$, and $T_2$ must map into the domain of $T_1$).

identity

$TI = T$ and $IT = T$ whenever $T \in \mathcal{L}(V, W)$ (note that in the first
equation $I$ is the identity map on $V$, and in the second equation $I$
is the identity map on $W$).

distributive properties

$(S_1 + S_2)T = S_1 T + S_2 T$ and $S(T_1 + T_2) = ST_1 + ST_2$ whenever
$T, T_1, T_2 \in \mathcal{L}(U, V)$ and $S, S_1, S_2 \in \mathcal{L}(V, W)$.


Multiplication of linear maps is not commutative. In other words, it 









now suppose that 
o do this, suppose 
























Proof: Suppose 
that dimV > dimW. 


























The Matrix of a Linear Map

if (Vi,..., v n ) is a basis of V 
> of Tvi, ■■■, 7V n determine tl 
In this section we will see how 
of recording the values of the 

























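\[
\begin{bmatrix} a_{1,1} & \dots & a_{1,n} \\ \vdots & & \vdots \\ a_{m,1} & \dots & a_{m,n} \end{bmatrix}
+
\begin{bmatrix} b_{1,1} & \dots & b_{1,n} \\ \vdots & & \vdots \\ b_{m,1} & \dots & b_{m,n} \end{bmatrix}
=
\begin{bmatrix} a_{1,1} + b_{1,1} & \dots & a_{1,n} + b_{1,n} \\ \vdots & & \vdots \\ a_{m,1} + b_{m,1} & \dots & a_{m,n} + b_{m,n} \end{bmatrix}.
\]

You should verify that with this definition of matrix addition,

\[ \mathcal{M}(T + S) = \mathcal{M}(T) + \mathcal{M}(S) \tag{3.9} \]

whenever $T, S \in \mathcal{L}(V, W)$.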


Still assuming that we have some bases in mind, is the matrix of a 
scalar times a linear map equal to the scalar times the matrix of the 
linear map? Again the question does not make sense because we have 
not defined scalar multiplication on matrices. Fortunately the obvious 
definition again has the right properties. Specifically, we define the 
product of a scalar and a matrix by multiplying each entry in the matrix 
by the scalar: 


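\[
c \begin{bmatrix} a_{1,1} & \dots & a_{1,n} \\ \vdots & & \vdots \\ a_{m,1} & \dots & a_{m,n} \end{bmatrix}
=
\begin{bmatrix} c a_{1,1} & \dots & c a_{1,n} \\ \vdots & & \vdots \\ c a_{m,1} & \dots & c a_{m,n} \end{bmatrix}.
\]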


You should verify that with this definition of scalar multiplication on 
matrices, 

\[ \mathcal{M}(cT) = c\mathcal{M}(T) \tag{3.10} \]

whenever $c \in \mathbf{F}$ and $T \in \mathcal{L}(V, W)$.

Because addition and scalar multiplication have now been defined 
for matrices, you should not be surprised that a vector space is about 
to appear. We need only a bit of notation so that this new vector space 













is a basis of $U$. Consider linear maps $S \colon U \to V$ and $T \colon V \to W$. The
composition $TS$ is a linear map from $U$ to $W$. How can $\mathcal{M}(TS)$ be
computed from $\mathcal{M}(T)$ and $\mathcal{M}(S)$? The nicest solution to this question
would be to have the following pretty relationship:

\[ \mathcal{M}(TS) = \mathcal{M}(T)\mathcal{M}(S). \tag{3.11} \]


So far, however, the right side of this equation does not make sense 
because we have not yet defined the product of two matrices. We will 
choose a definition of matrix multiplication that forces the equation 
above to hold. Let’s see how to do this. 

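Let

\[
\mathcal{M}(T) = \begin{bmatrix} a_{1,1} & \dots & a_{1,n} \\ \vdots & & \vdots \\ a_{m,1} & \dots & a_{m,n} \end{bmatrix}
\quad \text{and} \quad
\mathcal{M}(S) = \begin{bmatrix} b_{1,1} & \dots & b_{1,p} \\ \vdots & & \vdots \\ b_{n,1} & \dots & b_{n,p} \end{bmatrix}.
\]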

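For $k \in \{1, \dots, p\}$, we have

\begin{align*}
TSu_k &= T\Bigl(\sum_{r=1}^{n} b_{r,k} v_r\Bigr) \\
&= \sum_{r=1}^{n} b_{r,k} T v_r \\
&= \sum_{r=1}^{n} b_{r,k} \sum_{j=1}^{m} a_{j,r} w_j \\
&= \sum_{j=1}^{m} \Bigl(\sum_{r=1}^{n} a_{j,r} b_{r,k}\Bigr) w_j.
\end{align*}

Thus $\mathcal{M}(TS)$ is the $m$-by-$p$ matrix whose entry in row $j$, column $k$
equals $\sum_{r=1}^{n} a_{j,r} b_{r,k}$.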

Now it’s clear how to define matrix multiplication so that 3.11 holds.
Namely, if $A$ is an $m$-by-$n$ matrix with entries $a_{j,k}$ and $B$ is an $n$-by-$p$
matrix with entries $b_{j,k}$, then $AB$ is defined to be the $m$-by-$p$ matrix
whose entry in row $j$, column $k$, equals

\[ \sum_{r=1}^{n} a_{j,r} b_{r,k}. \]
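
The definition translates directly into code. The following Python sketch stores a matrix as a list of rows and checks, on one example, that applying the product matrix agrees with applying the two matrices in succession. The function names `matmul` and `apply_matrix` and the sample matrices are illustrative assumptions.

```python
def matmul(A, B):
    """Product of an m-by-n matrix A and an n-by-p matrix B.

    The entry in row j, column k is the sum over r of A[j][r] * B[r][k].
    """
    m, n, p = len(A), len(B), len(B[0])
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must match rows of B")
    return [[sum(A[j][r] * B[r][k] for r in range(n)) for k in range(p)]
            for j in range(m)]

def apply_matrix(A, v):
    """Multiply the matrix A by the column vector v (given as a list)."""
    return [sum(A[j][k] * v[k] for k in range(len(v))) for j in range(len(A))]

A = [[1, 2],
     [3, 4],
     [5, 6]]      # a 3-by-2 matrix, playing the role of M(T)
B = [[1, 0, 2],
     [0, 1, 3]]   # a 2-by-3 matrix, playing the role of M(S)

AB = matmul(A, B)  # a 3-by-3 matrix, playing the role of M(TS)

u = [1, 1, 1]
composed = apply_matrix(A, apply_matrix(B, u))  # first B, then A
direct = apply_matrix(AB, u)                    # the product applied once
```

Both `composed` and `direct` equal `[11, 25, 39]`, which is the behaviour the definition was designed to produce.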


You probably learned 
this definition of matrix 
multiplication in an 
earlier course, although 
you may not have seen 
this motivation for it. 




















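Proof: Let

\[
\mathcal{M}(T) = \begin{bmatrix} a_{1,1} & \dots & a_{1,n} \\ \vdots & & \vdots \\ a_{m,1} & \dots & a_{m,n} \end{bmatrix}. \tag{3.15}
\]

This means, we recall, that

\[ T v_k = \sum_{j=1}^{m} a_{j,k} w_j \tag{3.16} \]

for each $k$. Let $v$ be an arbitrary vector in $V$, which we can write in the
form 3.12. Thus $\mathcal{M}(v)$ is given by 3.13. Now

\begin{align*}
Tv &= b_1 T v_1 + \dots + b_n T v_n \\
&= b_1 \sum_{j=1}^{m} a_{j,1} w_j + \dots + b_n \sum_{j=1}^{m} a_{j,n} w_j \\
&= \sum_{j=1}^{m} (a_{j,1} b_1 + \dots + a_{j,n} b_n) w_j,
\end{align*}

where the first equality comes from 3.12 and the second equality comes
from 3.16. The last equation shows that $\mathcal{M}(Tv)$, the $m$-by-1 matrix of
the vector $Tv$ with respect to the basis $(w_1, \dots, w_m)$, is given by the
equation

\[
\mathcal{M}(Tv) = \begin{bmatrix} a_{1,1} b_1 + \dots + a_{1,n} b_n \\ \vdots \\ a_{m,1} b_1 + \dots + a_{m,n} b_n \end{bmatrix}.
\]

This formula, along with the formulas 3.15 and 3.13 and the definition
of matrix multiplication, shows that $\mathcal{M}(Tv) = \mathcal{M}(T)\mathcal{M}(v)$. ■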



Invertibility

A linear map $T \in \mathcal{L}(V, W)$ is called invertible if there exists a linear
map $S \in \mathcal{L}(W, V)$ such that $ST$ equals the identity map on $V$ and $TS$
equals the identity map on $W$. A linear map $S \in \mathcal{L}(W, V)$ satisfying
$ST = I$ and $TS = I$ is called an inverse of $T$ (note that the first $I$ is the
identity map on $V$ and the second $I$ is the identity map on $W$).

If $S$ and $S'$ are inverses of $T$, then




5.\ / Proposition: Am 

tive and surjective. 

Proof: Suppose T e . 


Now suppose that : 
that T is invertible. Fc 















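these vector spaces is
isomorphic to some
$\mathbf{F}^m$, thinking of them
that way often adds
complexity but no new
insight.

\[
A = \begin{bmatrix} a_{1,1} & \dots & a_{1,n} \\ \vdots & & \vdots \\ a_{m,1} & \dots & a_{m,n} \end{bmatrix}
\]

be a matrix in $\mathrm{Mat}(m, n, \mathbf{F})$. Let $T$ be the linear map from $V$ to $W$ such
that

\[ T v_k = \sum_{j=1}^{m} a_{j,k} w_j \]

for $k = 1, \dots, n$. Obviously $\mathcal{M}(T)$ equals $A$, and so the range of $\mathcal{M}$
equals $\mathrm{Mat}(m, n, \mathbf{F})$, as desired. ■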

























Suppose tl 
there exist 


null T = [/ if a 

14. Suppose that 
that T is injec 
that 5 了 is the 

15. Suppose that 
that T is surje 
that T5 is the 

16. Suppose that i 

















re equivalent: 


the only 





Chapter 4 

Polynomials


This short chapter contains no linear algebra. It does contain the 
background material on polynomials that we will need in our study 
of linear maps from a vector space to itself. Many of the results in 
this chapter will already be familiar to you from other courses; they 
are included here for completeness. Because this chapter is not about 
linear algebra, your instructor may go through it rapidly. You may not 
be asked to scrutinize all the proofs. Make sure, however, that you 
at least read and understand the statements of all the results in this 
chapter—they will be used in the rest of the book. 


Recall that F denotes R or C. 

















Corollary: Suppose 


UO + U\Z - 

all z e F, then ao = ••■ 

Proof: Suppose ao + ai 
4.3, no nonnegative int( 
is all the coefficients eqi 

The corollary above imp 


ice implies that each polynomial 
a linear combination of functior 













Complex Coefficients

So far we have been handling polynomials with complex coefficients
and polynomials with real coefficients simultaneously through our con-
































Real Coefficients

Before discussing polynomials with real coefficients, we need to
learn a bit more about the complex numbers.

Suppose $z = a + bi$, where $a$ and $b$ are real numbers. Then $a$ is
called the real part of $z$, denoted $\operatorname{Re} z$, and $b$ is called the imaginary
part of $z$, denoted $\operatorname{Im} z$. Thus for every complex number $z$, we have

\[ z = \operatorname{Re} z + (\operatorname{Im} z)i. \]

The complex conjugate of $z \in \mathbf{C}$, denoted $\bar{z}$, is defined by

\[ \bar{z} = \operatorname{Re} z - (\operatorname{Im} z)i. \]

For example, $\overline{2 + 3i} = 2 - 3i$.

The absolute value of a complex number $z$, denoted $|z|$, is defined
by

\[ |z| = \sqrt{(\operatorname{Re} z)^2 + (\operatorname{Im} z)^2}. \]

For example, $|1 + 2i| = \sqrt{5}$. Note that $|z|$ is always a nonnegative
number.
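
These definitions correspond to operations on Python's built-in `complex` type, which makes the two worked examples easy to reproduce. This is a small illustrative sketch; the variable names are assumptions.

```python
import math

z = 2 + 3j

re_z, im_z = z.real, z.imag      # Re z and Im z
rebuilt = complex(re_z, im_z)    # z = Re z + (Im z)i
conj_z = z.conjugate()           # the complex conjugate of z

abs_w = abs(1 + 2j)              # the absolute value |1 + 2i|
expected = math.sqrt(5)          # the value given in the text
```

Because `abs` returns a floating-point number, `abs_w` should be compared with `expected` up to rounding, for instance with `math.isclose`.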

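You should verify that the real and imaginary parts, absolute value,
and complex conjugate have the following properties:

additivity of real part

$\operatorname{Re}(w + z) = \operatorname{Re} w + \operatorname{Re} z$ for all $w, z \in \mathbf{C}$;

additivity of imaginary part

$\operatorname{Im}(w + z) = \operatorname{Im} w + \operatorname{Im} z$ for all $w, z \in \mathbf{C}$;

sum of $z$ and $\bar{z}$

$z + \bar{z} = 2 \operatorname{Re} z$ for all $z \in \mathbf{C}$;

difference of $z$ and $\bar{z}$

$z - \bar{z} = 2(\operatorname{Im} z)i$ for all $z \in \mathbf{C}$;

product of $z$ and $\bar{z}$

$z\bar{z} = |z|^2$ for all $z \in \mathbf{C}$;

additivity of complex conjugate

$\overline{w + z} = \bar{w} + \bar{z}$ for all $w, z \in \mathbf{C}$;

multiplicativity of complex conjugate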



































Chapter 5 

Eigenvalues and Eigenvectors


In Chapter 3 we studied linear maps from one vector space to another
vector space. Now we begin our investigation of linear maps from
a vector space to itself. Their study constitutes the deepest and most
important part of linear algebra. Most of the key results in this area
do not hold for infinite-dimensional vector spaces, so we work only on
finite-dimensional vector spaces. To avoid trivialities we also want to
eliminate the vector space $\{0\}$ from consideration. Thus we make the
following assumption:

Recall that F denotes R or C. 

Let’s agree that for the rest of the book 
V will denote a finite-dimensional, nonzero vector space over F. 


Invariant Subspaces
















The regrettable word
eigenvalue is
half-German,
half-English. The
German adjective eigen
means own in the sense
of characterizing some
intrinsic property.
Some mathematicians
use the term
characteristic value
instead of
eigenvalue.
























the linear dependence 







Polynomials Applied to Operators

The main reason that a richer theory exists for operators (which 
map a vector space into itself) than for linear maps is that operators 
can be raised to powers. In this section we define that notion and the 
key concept of applying a polynomial to an operator. 

If $T \in \mathcal{L}(V)$, then $TT$ makes sense and is also in $\mathcal{L}(V)$. We usually
write $T^2$ instead of $TT$. More generally, if $m$ is a positive integer, then
$T^m$ is defined by

\[ T^m = \underbrace{T \cdots T}_{m \text{ times}}. \]

For convenience we define $T^0$ to be the identity operator $I$ on $V$.

Recall from Chapter 3 that if $T$ is an invertible operator, then the
inverse of $T$ is denoted by $T^{-1}$. If $m$ is a positive integer, then we define
$T^{-m}$ to be $(T^{-1})^m$.

You should verify that if $T$ is an operator, then

\[ T^m T^n = T^{m+n} \quad \text{and} \quad (T^m)^n = T^{mn}, \]

where $m$ and $n$ are allowed to be arbitrary integers if $T$ is invertible
and nonnegative integers if $T$ is not invertible.

If $T \in \mathcal{L}(V)$ and $p \in \mathcal{P}(\mathbf{F})$ is a polynomial given by

\[ p(z) = a_0 + a_1 z + a_2 z^2 + \dots + a_m z^m \]

for $z \in \mathbf{F}$, then $p(T)$ is the operator defined by

\[ p(T) = a_0 I + a_1 T + a_2 T^2 + \dots + a_m T^m. \]

For example, if $p$ is the polynomial defined by $p(z) = z^2$ for $z \in \mathbf{F}$, then
$p(T) = T^2$. This is a new use of the symbol $p$ because we are applying
it to operators, not just elements of $\mathbf{F}$. If we fix an operator $T \in \mathcal{L}(V)$,
then the function from $\mathcal{P}(\mathbf{F})$ to $\mathcal{L}(V)$ given by $p \mapsto p(T)$ is linear, as
you should verify.

If $p$ and $q$ are polynomials with coefficients in $\mathbf{F}$, then $pq$ is the
polynomial defined by

\[ (pq)(z) = p(z)q(z) \]

for $z \in \mathbf{F}$. You should verify that we have the following nice multiplicative
property: if $T \in \mathcal{L}(V)$, then

\[ (pq)(T) = p(T)q(T) \]

for all polynomials $p$ and $q$ with coefficients in $\mathbf{F}$. Note that any two
polynomials in $T$ commute, meaning that $p(T)q(T) = q(T)p(T)$, because

\[ p(T)q(T) = (pq)(T) = (qp)(T) = q(T)p(T). \]
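
The multiplicative property can be tried out on matrices. The Python sketch below represents an operator by a square matrix (a list of rows) and a polynomial by its list of coefficients $[a_0, a_1, \dots, a_m]$. The function names `matmul`, `identity`, `poly_of`, and `poly_mul`, and the sample operator and polynomials, are illustrative assumptions.

```python
def matmul(A, B):
    """Product of two n-by-n matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][r] * B[r][j] for r in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    """The n-by-n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def poly_of(coeffs, T):
    """Evaluate a0*I + a1*T + ... + am*T^m for coeffs = [a0, ..., am]."""
    n = len(T)
    result = [[0] * n for _ in range(n)]
    power = identity(n)              # T^0
    for a in coeffs:
        result = [[result[i][j] + a * power[i][j] for j in range(n)]
                  for i in range(n)]
        power = matmul(power, T)     # next power of T
    return result

def poly_mul(p, q):
    """Coefficients of the product polynomial pq."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

T = [[1, 2],
     [0, 3]]
p = [1, 0, 1]   # p(z) = 1 + z^2
q = [2, 1]      # q(z) = 2 + z

pT, qT = poly_of(p, T), poly_of(q, T)
pq_of_T = poly_of(poly_mul(p, q), T)   # (pq)(T)
product = matmul(pT, qT)               # p(T)q(T)
swapped = matmul(qT, pT)               # q(T)p(T)
```

For this example all three of `pq_of_T`, `product`, and `swapped` equal `[[6, 44], [0, 50]]`.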


Upper-Triangular Matrices



note it by M(T, (vi,..., 
is clear from the context 


If 了 is an operator on 



is 1 in the j th slot and 0 
the 产 column of M(T) 


A central goal of line 
T g £(V) f there exists 
reasonably simple matm 

















suppose 






















as can be seen from 5.17, $T$ maps each of the vectors $v_1, \dots, v_{k-1}$ into
$\operatorname{span}(v_1, \dots, v_{k-1})$. Because $\lambda_k = 0$, the matrix representation 5.17 also
implies that $Tv_k \in \operatorname{span}(v_1, \dots, v_{k-1})$. Thus we can define a linear map

\[ S \colon \operatorname{span}(v_1, \dots, v_k) \to \operatorname{span}(v_1, \dots, v_{k-1}) \]










by $Sv = Tv$ for $v \in \operatorname{span}(v_1, \dots, v_k)$. In other words, $S$ is just $T$
restricted to $\operatorname{span}(v_1, \dots, v_k)$.

Note that $\operatorname{span}(v_1, \dots, v_k)$ has dimension $k$ and $\operatorname{span}(v_1, \dots, v_{k-1})$
has dimension $k - 1$ (because $(v_1, \dots, v_n)$ is linearly independent). Because
$\operatorname{span}(v_1, \dots, v_k)$ has a larger dimension than $\operatorname{span}(v_1, \dots, v_{k-1})$,
no linear map from $\operatorname{span}(v_1, \dots, v_k)$ to $\operatorname{span}(v_1, \dots, v_{k-1})$ is injective
(see 3.5). Thus there exists a nonzero vector $v \in \operatorname{span}(v_1, \dots, v_k)$ such
that $Sv = 0$. Hence $Tv = 0$, and thus $T$ is not invertible, as desired.

To prove the other direction, now suppose that $T$ is not invertible.
Thus $T$ is not injective (see 3.21), and hence there exists a nonzero
vector $v \in V$ such that $Tv = 0$. Because $(v_1, \dots, v_n)$ is a basis of $V$, we
can write

\[ v = a_1 v_1 + \dots + a_k v_k, \]

where $a_1, \dots, a_k \in \mathbf{F}$ and $a_k \neq 0$ (represent $v$ as a linear combination
of $(v_1, \dots, v_n)$ and then choose $k$ to be the largest index with a nonzero
coefficient). Thus

\begin{align*}
0 &= Tv \\
&= T(a_1 v_1 + \dots + a_k v_k) \\
&= (a_1 T v_1 + \dots + a_{k-1} T v_{k-1}) + a_k T v_k.
\end{align*}

The last term in parentheses is in $\operatorname{span}(v_1, \dots, v_{k-1})$ (because of the
upper-triangular form of 5.17). Thus the last equation shows that
$a_k T v_k \in \operatorname{span}(v_1, \dots, v_{k-1})$. Multiplying by $1/a_k$, which is allowed
because $a_k \neq 0$, we conclude that $Tv_k \in \operatorname{span}(v_1, \dots, v_{k-1})$. Thus
when $Tv_k$ is written as a linear combination of the basis $(v_1, \dots, v_n)$,
the coefficient of $v_k$ will be 0. In other words, $\lambda_k$ in 5.17 must be 0,
completing the proof. ■


Powerful numeric 
techniques exist for 
finding good 
approximations to the 
eigenvalues of an 
operator from its 
matrix. 


Unfortunately no method exists for exactly computing the eigenvalues
of a typical operator from its matrix (with respect to an arbitrary
basis). However, if we are fortunate enough to find a basis with respect
to which the matrix of the operator is upper triangular, then the
problem of computing the eigenvalues becomes trivial, as the following
proposition shows.

5.18 Proposition: Suppose $T \in \mathcal{L}(V)$ has an upper-triangular matrix
with respect to some basis of $V$. Then the eigenvalues of $T$ consist
precisely of the entries on the diagonal of that upper-triangular matrix.
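
A small numerical illustration of 5.18 follows. The Python sketch takes a hypothetical 2-by-2 upper-triangular matrix `A`, reads the candidate eigenvalues off its diagonal, and confirms each one with an eigenvector found by hand. The matrix, the eigenvectors, and the helper names are assumptions made for the example; this illustrates the statement and is not a proof.

```python
def apply_matrix(A, v):
    """Multiply the square matrix A (a list of rows) by the vector v."""
    return [sum(A[j][k] * v[k] for k in range(len(v))) for j in range(len(A))]

def diagonal_entries(A):
    """The entries on the diagonal of a square matrix."""
    return [A[k][k] for k in range(len(A))]

A = [[2, 1],
     [0, 3]]   # upper triangular

eigenvalues = diagonal_entries(A)

# Hand-computed eigenvectors: A(1, 0) = 2(1, 0) and A(1, 1) = 3(1, 1).
pairs = [(2, [1, 0]), (3, [1, 1])]
confirmed = [apply_matrix(A, v) == [lam * x for x in v] for lam, v in pairs]

# For a scalar that is not on the diagonal, such as 5, the matrix A - 5I is
# upper triangular with nonzero diagonal entries, so it is invertible and
# 5 is not an eigenvalue.
shifted_diagonal = [d - 5 for d in eigenvalues]
```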





is a diagonal matrix. Obviously ( 
lar, although in general a diago] 
upper-triangular matrix. 

An operator T g L{V) has a 


















actors, T has 

■ 

al conditions 


.,v n ) consisting of 






















Invariant Subspaces on Real Vector Spaces

We know that every operator on a complex vector space has an eigenvalue
(see 5.10 for the precise statement). We have also seen an example
showing that the analogous statement is false on real vector spaces. In
other words, an operator on a nonzero real vector space may have no
invariant subspaces of dimension 1. However, we now show that an
invariant subspace of dimension 1 or 2 always exists.

5.24 Theorem: Every operator on a finite-dimensional, nonzero, real
vector space has an invariant subspace of dimension 1 or 2.

Proof: Suppose $V$ is a real vector space with dimension $n > 0$ and
$T \in \mathcal{L}(V)$. Choose $v \in V$ with $v \neq 0$. Then

\[ (v, Tv, T^2 v, \dots, T^n v) \]

cannot be linearly independent because $V$ has dimension $n$ and we have
$n + 1$ vectors. Thus there exist real numbers $a_0, \dots, a_n$, not all 0, such
that

\[ 0 = a_0 v + a_1 Tv + \dots + a_n T^n v. \]

Make the $a$'s the coefficients of a polynomial, which can be written in
factored form (see 4.14) as

\begin{align*}
a_0 &+ a_1 x + \dots + a_n x^n \\
&= c(x - \lambda_1) \cdots (x - \lambda_m)(x^2 + \alpha_1 x + \beta_1) \cdots (x^2 + \alpha_M x + \beta_M),
\end{align*}

Here either $m$ or $M$
might equal 0.

where $c$ is a nonzero real number, each $\lambda_j$, $\alpha_j$, and $\beta_j$ is real, $m + M \geq 1$,
and the equation holds for all $x \in \mathbf{R}$. We then have

\begin{align*}
0 &= a_0 v + a_1 Tv + \dots + a_n T^n v \\
&= (a_0 I + a_1 T + \dots + a_n T^n) v \\
&= c(T - \lambda_1 I) \cdots (T - \lambda_m I)(T^2 + \alpha_1 T + \beta_1 I) \cdots (T^2 + \alpha_M T + \beta_M I) v,
\end{align*}

which means that $T - \lambda_j I$ is not injective for at least one $j$ or that
$(T^2 + \alpha_j T + \beta_j I)$ is not injective for at least one $j$. If $T - \lambda_j I$ is not
injective for at least one $j$, then $T$ has an eigenvalue and hence a
one-dimensional invariant subspace. Let’s consider the other possibility. In
other words, suppose that $(T^2 + \alpha_j T + \beta_j I)$ is not injective for some $j$.
Thus there exists a nonzero vector $u$ such that






















1. Suppose T 
invariant u] 

2. Suppose T 


1 that is 
=V. 

Dve that 


Define T g £(F 3 ) by 

r(zi,z 2 ,z ： 

Find all eigenvalues and eige: 

Suppose n is a positive integ 

T{xi,...,x n ) = (xi + 

in other words, T is the opei 
the standard basis) consists 
eigenvectors of T. 

Find all eigenvalues and eige 
erator T e X(F°°) defined by 

T(Zi,Z 2 ,Z 3 , 

Suppose T e £(V) and dim 
most k + 1 distinct eigenvalu 

Suppose T e £(V) is invertil 
















Chapter 6 

Inner-Product Spaces


In making the definition of a vector space, we generalized the linear
structure (addition and scalar multiplication) of $\mathbf{R}^2$ and $\mathbf{R}^3$. We
ignored other important features, such as the notions of length and
angle. These ideas are embedded in the concept we now investigate,
inner products.


Recall that F denotes R or C. 

Also, V is a finite-dimensional, nonzero vector space over F. 




Inner Products

To motivate the concept of inner product, let's think of vectors in $\mathbf{R}^2$
and $\mathbf{R}^3$ as arrows with initial point at the origin. The length of a
vector $x$ in $\mathbf{R}^2$ or $\mathbf{R}^3$ is called the norm of $x$, denoted $\|x\|$. Thus for
$x = (x_1, x_2) \in \mathbf{R}^2$, we have $\|x\| = \sqrt{x_1^2 + x_2^2}$.

If we think of vectors as points instead of arrows, then $\|x\|$
should be interpreted as the distance from the point $x$ to the
origin.

[Figure: the vector $x$ drawn as an arrow from the origin to the point $(x_1, x_2)$; axes labeled $x_1$-axis and $x_2$-axis.]

The length of this vector $x$ is $\sqrt{x_1^2 + x_2^2}$.

Similarly, for $x = (x_1, x_2, x_3) \in \mathbf{R}^3$, we have $\|x\| = \sqrt{x_1^2 + x_2^2 + x_3^2}$.
Even though we cannot draw pictures in higher dimensions, the
generalization to $\mathbf{R}^n$ is obvious: we define the norm of $x = (x_1, \dots, x_n) \in \mathbf{R}^n$
by

\[ \|x\| = \sqrt{x_1^2 + \dots + x_n^2}. \]

The norm is not linear on $\mathbf{R}^n$. To inject linearity into the discussion,
we introduce the dot product. For $x, y \in \mathbf{R}^n$, the dot product of $x$
and $y$, denoted $x \cdot y$, is defined by

\[ x \cdot y = x_1 y_1 + \dots + x_n y_n, \]

where $x = (x_1, \dots, x_n)$ and $y = (y_1, \dots, y_n)$. Note that the dot product
of two vectors in $\mathbf{R}^n$ is a number, not a vector. Obviously $x \cdot x = \|x\|^2$
for all $x \in \mathbf{R}^n$. In particular, $x \cdot x \ge 0$ for all $x \in \mathbf{R}^n$, with equality if
and only if $x = 0$. Also, if $y \in \mathbf{R}^n$ is fixed, then clearly the map from $\mathbf{R}^n$
to $\mathbf{R}$ that sends $x \in \mathbf{R}^n$ to $x \cdot y$ is linear. Furthermore, $x \cdot y = y \cdot x$
for all $x, y \in \mathbf{R}^n$.
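The properties of the dot product listed above can be checked on concrete vectors. The following is a minimal sketch in Python; the function names `dot` and `norm` and the sample vectors are illustrative choices, not notation from the text.

```python
import math

def dot(x, y):
    """Dot product of two real vectors given as equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    """Euclidean norm, computed as the square root of x . x."""
    return math.sqrt(dot(x, x))

x = (3.0, 4.0)
y = (1.0, -2.0)
z = (0.5, 2.0)

print(norm(x))               # 5.0
print(dot(x, y), dot(y, x))  # symmetry: the two values agree

# Linearity in the first slot for fixed y: (2x + z) . y = 2(x . y) + z . y
lhs = dot(tuple(2 * a + b for a, b in zip(x, z)), y)
rhs = 2 * dot(x, y) + dot(z, y)
print(lhs, rhs)

# The norm itself is not linear: ||x + z|| differs from ||x|| + ||z||
norm_of_sum = norm(tuple(a + b for a, b in zip(x, z)))
sum_of_norms = norm(x) + norm(z)
print(norm_of_sum, sum_of_norms)
```

The last comparison illustrates why the dot product, rather than the norm, is the object that carries linearity.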

An inner product is a generalization of the dot product. At this 
point you should be tempted to guess that an inner product is defined 















If $z$ is a complex
number, then the
statement $z \ge 0$ means
that $z$ is real and
nonnegative.



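positivity

$\langle v, v \rangle \ge 0$ for all $v \in V$;

definiteness

$\langle v, v \rangle = 0$ if and only if $v = 0$;

additivity in first slot

$\langle u + v, w \rangle = \langle u, w \rangle + \langle v, w \rangle$ for all $u, v, w \in V$;

homogeneity in first slot

$\langle av, w \rangle = a \langle v, w \rangle$ for all $a \in \mathbf{F}$ and all $v, w \in V$;

conjugate symmetry

$\langle v, w \rangle = \overline{\langle w, v \rangle}$ for all $v, w \in V$.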


Recall that every real number equals its complex conjugate. Thus
if we are dealing with a real vector space, then in the last condition
above we can dispense with the complex conjugate and simply state
that $\langle v, w \rangle = \langle w, v \rangle$ for all $v, w \in V$.

An inner-product space is a vector space V along with an inner 
product on V. 

The most important example of an inner-product space is $\mathbf{F}^n$. We










orthogonal to every vector. Furthermore, 0 is the only vector that is orthogonal to
itself.

For the special case where $V = \mathbf{R}^2$, the next theorem is over 2,500
years old.


The word orthogonal 
comes from the Greek 
word orthogonios, 


6.3 Pythagorean Theorem: If $u, v$ are orthogonal vectors in $V$, then

6.4   $\|u + v\|^2 = \|u\|^2 + \|v\|^2$.





$= \langle u, v \rangle - a\|v\|^2.$

e should choose $a$ to be $\langle u, v \rangle / \|v\|^2$






1 decomposition 




Multiplying both sides of this inequality by $\|v\|^2$ and then taking square















Norms 




6.9 Triangle Inequality: If $u, v \in V$, then

6.10   $\|u + v\| \le \|u\| + \|v\|$.

This inequality is an equality if and only if one of $u, v$ is a nonnegative
multiple of the other.

Proof: Let $u, v \in V$. Then

\[
\begin{aligned}
\|u + v\|^2 &= \langle u + v, u + v \rangle \\
&= \langle u, u \rangle + \langle v, v \rangle + \langle u, v \rangle + \langle v, u \rangle \\
&= \langle u, u \rangle + \langle v, v \rangle + \langle u, v \rangle + \overline{\langle u, v \rangle} \\
&= \|u\|^2 + \|v\|^2 + 2 \operatorname{Re} \langle u, v \rangle \\
&\le \|u\|^2 + \|v\|^2 + 2 |\langle u, v \rangle| && \text{(6.11)} \\
&\le \|u\|^2 + \|v\|^2 + 2 \|u\| \, \|v\| && \text{(6.12)} \\
&= (\|u\| + \|v\|)^2,
\end{aligned}
\]

where 6.12 follows from the Cauchy-Schwarz inequality (6.6). Taking
square roots of both sides of the inequality above gives the triangle
inequality 6.10.

The proof above shows that the triangle inequality 6.10 is an equality
if and only if we have equality in 6.11 and 6.12. Thus we have equality
in the triangle inequality 6.10 if and only if

6.13   $\langle u, v \rangle = \|u\| \, \|v\|$.

If one of $u, v$ is a nonnegative multiple of the other, then 6.13 holds, as
you should verify. Conversely, suppose 6.13 holds. Then the condition
for equality in the Cauchy-Schwarz inequality (6.6) implies that one of
$u, v$ must be a scalar multiple of the other. Clearly 6.13 forces the
scalar in question to be nonnegative, as desired. ■
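The triangle inequality and its equality case can be observed numerically. This is a minimal sketch, assuming the Euclidean inner product on $\mathbf{C}^3$ (linear in the first slot, conjugate-linear in the second); the helper names and sample vectors are illustrative.

```python
import math

def inner(u, v):
    """Euclidean inner product: linear in the first slot, conjugate-linear in the second."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

def norm(u):
    return math.sqrt(inner(u, u).real)

def add(u, v):
    return [a + b for a, b in zip(u, v)]

u = [1 + 2j, -1j, 3]
v = [2, 1 - 1j, -1 + 1j]

# General case: ||u + v|| <= ||u|| + ||v||
general_gap = norm(u) + norm(v) - norm(add(u, v))

# Equality case: w is a nonnegative multiple of u
w = [2.5 * a for a in u]
equality_gap = norm(u) + norm(w) - norm(add(u, w))

# A negative multiple of u gives strict inequality
n = [-a for a in u]
negative_gap = norm(u) + norm(n) - norm(add(u, n))

print(general_gap, equality_gap, negative_gap)
```

The middle gap vanishes up to rounding, while the other two are strictly positive, in line with the equality condition in 6.9.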


The next result is called the parallelogram equality because of its 
geometric interpretation: in any parallelogram, the sum of the squares 
of the lengths of the diagonals equals the sum of the squares of the 
lengths of the four sides. 


The triangle inequality 
can be used to show 
that the shortest path 
between two points is a 










peated applications of t: 





Orthonormal Bases 




6.16 Corollary: Every orthonormal list of vectors is linearly
independent.

Proof: Suppose $(e_1, \dots, e_m)$ is an orthonormal list of vectors in $V$
and $a_1, \dots, a_m \in \mathbf{F}$ are such that

\[ a_1 e_1 + \dots + a_m e_m = 0. \]

Then $|a_1|^2 + \dots + |a_m|^2 = 0$ (by 6.15), which means that all the $a_j$'s
are 0, as desired. ■

An orthonormal basis of $V$ is an orthonormal list of vectors in $V$
that is also a basis of $V$. For example, the standard basis is an
orthonormal basis of $\mathbf{F}^n$. Every orthonormal list of vectors in $V$ with length
$\dim V$ is automatically an orthonormal basis of $V$ (proof: by the
previous corollary, any such list must be linearly independent; because it
has the right length, it must be a basis; see 2.17). To illustrate this
principle, consider the following list of four vectors in $\mathbf{R}^4$:

The verification that this list is orthonormal is easy (do it!); because we
have an orthonormal list of length four in a four-dimensional vector
space, it must be an orthonormal basis.

In general, given a basis $(e_1, \dots, e_n)$ of $V$ and a vector $v \in V$, we
know that there is some choice of scalars $a_1, \dots, a_n$ such that

\[ v = a_1 e_1 + \dots + a_n e_n, \]

but finding the $a_j$'s can be difficult. The next theorem shows, however,
that this is easy for an orthonormal basis.

6.17 Theorem: Suppose $(e_1, \dots, e_n)$ is an orthonormal basis of $V$.
Then

6.18   $v = \langle v, e_1 \rangle e_1 + \dots + \langle v, e_n \rangle e_n$

and

6.19   $\|v\|^2 = |\langle v, e_1 \rangle|^2 + \dots + |\langle v, e_n \rangle|^2$

for every $v \in V$.

The importance of
orthonormal bases
stems mainly from this
theorem.
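Formulas 6.18 and 6.19 can be checked with exact rational arithmetic. The sketch below uses an orthonormal list of four vectors in $\mathbf{R}^4$ chosen for illustration (an assumption of this example, not necessarily the list used in the text) and a sample vector `v`.

```python
from fractions import Fraction

h = Fraction(1, 2)
# An illustrative orthonormal basis of R^4 (every entry is +1/2 or -1/2).
basis = [
    (h, h, h, h),
    (h, h, -h, -h),
    (h, -h, -h, h),
    (-h, h, -h, h),
]

def inner(u, v):
    """Dot product on R^n with exact Fraction arithmetic."""
    return sum(a * b for a, b in zip(u, v))

# Orthonormality: <e_j, e_k> is 1 when j == k and 0 otherwise.
is_orthonormal = all(
    inner(basis[j], basis[k]) == (1 if j == k else 0)
    for j in range(4) for k in range(4)
)

v = (Fraction(3), Fraction(-1), Fraction(4), Fraction(2))

# 6.18: the coefficient of e_j is simply <v, e_j>.
coeffs = [inner(v, e) for e in basis]
rebuilt = tuple(sum(c * e[i] for c, e in zip(coeffs, basis)) for i in range(4))

# 6.19: ||v||^2 equals the sum of the squared coefficients.
norm_sq = inner(v, v)
sum_sq = sum(c * c for c in coeffs)

print(is_orthonormal, coeffs, rebuilt == v, norm_sq == sum_sq)
```

No linear system has to be solved to find the coefficients, which is the practical content of the theorem.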







Schmidt (1876-1959) 



Proof: Suppose $(v_1, \dots, v_m)$ is a linearly independent list of
tors in $V$. To construct the $e$'s, start by setting $e_1 = v_1 / \|v_1\|$. r
satisfies 6.21 for $j = 1$. We will choose $e_2, \dots, e_m$ inductively, as
lows. Suppose $j > 1$ and an orthonormal list $(e_1, \dots, e_{j-1})$ has t














6.24 Corollary: Every finite-dimensional inner-product space has an 
orthonormal basis. 

Proof: Choose a basis of V. Apply the Gram-Schmidt procedure 
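The Gram-Schmidt procedure referred to here can be sketched in a few lines. The version below is an assumption-laden illustration: it uses the standard form of the procedure (subtract from each $v_j$ its components along the previously built $e$'s, then normalize), works on real vectors with the dot product, and uses a hypothetical tolerance `tol` to detect linear dependence.

```python
import math

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors, tol=1e-12):
    """Return an orthonormal list whose first j vectors span the same
    subspace as the first j input vectors, for each j."""
    es = []
    for v in vectors:
        w = list(v)
        # Remove the components of v along the e's built so far.
        for e in es:
            c = inner(v, e)
            w = [wi - c * ei for wi, ei in zip(w, e)]
        length = math.sqrt(inner(w, w))
        if length < tol:
            raise ValueError("input list is not linearly independent")
        es.append([wi / length for wi in w])
    return es

vs = [(1.0, 1.0, 0.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
es = gram_schmidt(vs)

# Largest deviation of the Gram matrix of the output from the identity.
max_deviation = max(
    abs(inner(es[j], es[k]) - (1.0 if j == k else 0.0))
    for j in range(3) for k in range(3)
)
print(es, max_deviation)
```

Because each $v_j$ lies in the span of $e_1, \dots, e_j$, the inner products $\langle v_j, e_k \rangle$ vanish for $k > j$; this is the property used when upper-triangular matrices are transferred to an orthonormal basis.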


extended to an orthonormal basis. 
Schmidt procedure shows that such a: 






some basis of V ， th 


matrix with respect to some orthonormal basis of $V$.


Proof: Suppose $T$ has an upper-triangular matrix
some basis $(v_1, \dots, v_n)$ of $V$. Thus $\operatorname{span}(v_1, \dots, v_j)$ is
$T$ for each $j =$ (see 5.12).

Apply the Gram-Schmidt procedure to $(v_1, \dots, v_n)$
orthonormal basis $(e_1, \dots, e_n)$ of $V$. Because













for each $j$ (see 6.21), we conclude that $\operatorname{span}(e_1, \dots, e_j)$ is invariant
under $T$ for each $j = 1, \dots, n$. Thus, by 5.12, $T$ has an upper-triangular
matrix with respect to the orthonormal basis $(e_1, \dots, e_n)$. ■

The next result is an important application of the corollary above. 


6.28 Corollary: Suppose $V$ is a complex vector space and $T \in \mathcal{L}(V)$.
Then $T$ has an upper-triangular matrix with respect to some
orthonormal basis of $V$.

Proof: This follows immediately from 5.13 and 6.27. ■ 

Orthogonal Projections and
Minimization Problems


This result is 
sometimes called 
Schur’s theorem. The 
German mathematician 
Issai Schur published 
the first proof of this 
result in 1909. 


If $U$ is a subset of $V$, then the orthogonal complement of $U$,
denoted $U^\perp$, is the set of all vectors in $V$ that are orthogonal to every
vector in $U$:

\[ U^\perp = \{ v \in V : \langle v, u \rangle = 0 \text{ for all } u \in U \}. \]

You should verify that $U^\perp$ is always a subspace of $V$, that $V^\perp = \{0\}$,
and that $\{0\}^\perp = V$. Also note that if $U_1 \subset U_2$, then $U_1^\perp \supset U_2^\perp$.

Recall that if $U_1, U_2$ are subspaces of $V$, then $V$ is the direct sum of
$U_1$ and $U_2$ (written $V = U_1 \oplus U_2$) if each element of $V$ can be written in
exactly one way as a vector in $U_1$ plus a vector in $U_2$. The next theorem
shows that every subspace of an inner-product space leads to a natural
direct sum decomposition of the whole space.

6.29 Theorem: If $U$ is a subspace of $V$, then

\[ V = U \oplus U^\perp. \]

Proof: Suppose that $U$ is a subspace of $V$. First we will show that

6.30   $V = U + U^\perp$.

To do this, suppose $v \in V$. Let $(e_1, \dots, e_m)$ be an orthonormal basis
of $U$. Obviously

















where 6.38 comes from the Pythagorean theorem (6.3), which applies
because $v - P_U v \in U^\perp$ and $P_U v - u \in U$. Taking square roots gives the
desired inequality.

Our inequality is an equality if and only if 6.37 is an equality, which
happens if and only if $\|P_U v - u\| = 0$, which happens if and only if
$u = P_U v$. ■





The last proposition is often combined with the formula 6.35 to
compute explicit solutions to minimization problems. As an
illustration of this procedure, consider the problem of finding a polynomial $u$
with real coefficients and degree at most 5 that on the interval $[-\pi, \pi]$
approximates $\sin x$ as well as possible, in the sense that

\[ \int_{-\pi}^{\pi} |\sin x - u(x)|^2 \, dx \]

is as small as possible. To solve this problem, let $C[-\pi, \pi]$ denote the
real vector space of continuous real-valued functions on $[-\pi, \pi]$ with
inner product

6.39   $\langle f, g \rangle = \int_{-\pi}^{\pi} f(x) g(x) \, dx$.

Let $v \in C[-\pi, \pi]$ be the function defined by $v(x) = \sin x$. Let $U$
denote the subspace of $C[-\pi, \pi]$ consisting of the polynomials with
real coefficients and degree at most 5. Our problem can now be
reformulated as follows: find $u \in U$ such that $\|v - u\|$ is as small as
possible.

To compute the solution to our approximation problem, first apply
the Gram-Schmidt procedure (using the inner product given by 6.39)








to the basis $(1, x, x^2, x^3, x^4, x^5)$ of $U$, producing an orthonormal basis
$(e_1, e_2, e_3, e_4, e_5, e_6)$ of $U$. Then, again using the inner product given
by 6.39, compute $P_U v$ using 6.35 (with $m = 6$). Doing this computation
shows that $P_U v$ is the function

6.40   $0.987862x - 0.155271x^3 + 0.00564312x^5$,

where the $\pi$'s that appear in the exact answer have been replaced with
a good decimal approximation.
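The coefficients in 6.40 can be reproduced numerically. The sketch below deliberately swaps in a different but equivalent technique: instead of running Gram-Schmidt and applying 6.35, it solves the normal equations $\langle u, x^k \rangle = \langle \sin, x^k \rangle$ for the coefficients of $u$ in the monomial basis, which characterizes the same orthogonal projection. It also assumes, because $\sin$ is odd and the interval is symmetric, that only the odd powers $x, x^3, x^5$ contribute. All function names are illustrative.

```python
import math

PI = math.pi
ODD = (1, 3, 5)  # assumption: even-degree terms vanish because sin is odd

def monomial_inner(j, k):
    """<x^j, x^k> = integral of x^(j+k) over [-pi, pi]."""
    n = j + k
    return 0.0 if n % 2 else 2 * PI ** (n + 1) / (n + 1)

def sin_moment(k):
    """<sin, x^k> for odd k, from integrating by parts twice:
    I_k = 2 pi^k - k (k - 1) I_(k-2), with I_1 = 2 pi."""
    value = 2 * PI
    for m in range(3, k + 1, 2):
        value = 2 * PI ** m - m * (m - 1) * value
    return value

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

gram = [[monomial_inner(j, k) for k in ODD] for j in ODD]
rhs = [sin_moment(k) for k in ODD]
coeffs = solve(gram, rhs)  # coefficients of x, x^3, x^5
print(coeffs)
```

Solving the normal equations avoids constructing the orthonormal basis explicitly; the Gram-Schmidt route described in the text yields the same polynomial, since the orthogonal projection onto $U$ is unique.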

By 6.36, the polynomial above should be about as good an
approximation to $\sin x$ on $[-\pi, \pi]$ as is possible using polynomials of degree
at most 5. To see how good this approximation is, the picture below
shows the graphs of both $\sin x$ and our approximation 6.40 over the
interval $[-\pi, \pi]$.



Our approximation 6.40 is so accurate that the two graphs are almost 
identical—our eyes may see only one graph! 

Another well-known approximation to $\sin x$ by a polynomial of
degree 5 is given by the Taylor polynomial

6.41   $x - \dfrac{x^3}{3!} + \dfrac{x^5}{5!}$.

To see how good this approximation is, the next picture shows the
graphs of both $\sin x$ and the Taylor polynomial 6.41 over the interval
$[-\pi, \pi]$.


A machine that can 
perform integrations is 
useful here. 








Graphs of sinx and the Taylor polynomial 6.41 


The Taylor polynomial is an excellent approximation to $\sin x$ for $x$
near 0. But the picture above shows that for $|x| > 2$, the Taylor
polynomial is not so accurate, especially compared to 6.40. For example,
taking x = 3, our approximation 6.40 estimates sin 3 with an error of 
about 0.001, but the Taylor series 6.41 estimates sin 3 with an error of 
about 0.4. Thus at x = 3, the error in the Taylor series is hundreds of 
times larger than the error given by 6.40. Linear algebra has helped us 
discover an approximation to sinx that improves upon what we learned 
in calculus! 
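The comparison at $x = 3$ is easy to reproduce. The sketch below evaluates 6.40 with the rounded coefficients printed in the text and the degree-5 Taylor polynomial of $\sin$ at 0; the function names are illustrative.

```python
import math

def projection_640(x):
    """The polynomial 6.40, using the rounded coefficients from the text."""
    return 0.987862 * x - 0.155271 * x**3 + 0.00564312 * x**5

def taylor_641(x):
    """Degree-5 Taylor polynomial of sin at 0."""
    return x - x**3 / math.factorial(3) + x**5 / math.factorial(5)

x = 3.0
err_projection = abs(math.sin(x) - projection_640(x))
err_taylor = abs(math.sin(x) - taylor_641(x))
ratio = err_taylor / err_projection

print(err_projection, err_taylor, ratio)
```

The first error is on the order of 0.001 and the second is close to 0.4, so the ratio is in the hundreds, matching the claim above.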

We derived our approximation 6.40 by using 6.35 and 6.36. Our
standing assumption that $V$ is finite dimensional fails when $V$ equals
$C[-\pi, \pi]$, so we need to justify our use of those results in this case.
First, reread the proof of 6.29, which states that if $U$ is a subspace of $V$,
then

6.42   $V = U \oplus U^\perp$.

If we allow $V$ to be
infinite dimensional
and allow $U$ to be an
infinite-dimensional
subspace of $V$, then
6.42 is not necessarily
true without additional
hypotheses.

Note that the proof uses the finite dimensionality of $U$ (to get a basis
of $U$) but that it works fine regardless of whether or not $V$ is finite
dimensional. Second, note that the definition and properties of $P_U$
(including 6.35) require only 6.29 and thus require only that $U$ (but not
necessarily $V$) be finite dimensional. Finally, note that the proof of 6.36
does not require the finite dimensionality of $V$. Conclusion: for $v \in V$
and $U$ a subspace of $V$, the procedure discussed above for finding the
vector $u \in U$ that makes $\|v - u\|$ as small as possible works if $U$ is finite
dimensional, regardless of whether or not $V$ is finite dimensional. In
the example above $U$ was indeed finite dimensional (we had $\dim U = 6$),
so everything works as expected.





Linear Functionals and Adjoints
















\[ T^*(y_1, y_2) = (2y_2, y_1, 3y_1). \]


Note that in the example above, $T^*$ turned out to be not just a
function from $\mathbf{R}^2$ to $\mathbf{R}^3$, but a linear map. That is true in general.
Specifically, if $T \in \mathcal{L}(V, W)$, then $T^* \in \mathcal{L}(W, V)$. To prove this, suppose
$T \in \mathcal{L}(V, W)$. Let's begin by checking additivity. Fix $w_1, w_2 \in W$.
Then


Adjoints play a crucial 



results in the next 
chapter. 


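\[
\begin{aligned}
\langle Tv, w_1 + w_2 \rangle &= \langle Tv, w_1 \rangle + \langle Tv, w_2 \rangle \\
&= \langle v, T^* w_1 \rangle + \langle v, T^* w_2 \rangle \\
&= \langle v, T^* w_1 + T^* w_2 \rangle,
\end{aligned}
\]

which shows that $T^* w_1 + T^* w_2$ plays the role required of $T^*(w_1 + w_2)$.
Because only one vector can behave that way, we must have

\[ T^* w_1 + T^* w_2 = T^*(w_1 + w_2). \]

Now let's check the homogeneity of $T^*$. If $a \in \mathbf{F}$, then

\[
\begin{aligned}
\langle Tv, aw \rangle &= \bar{a} \langle Tv, w \rangle \\
&= \bar{a} \langle v, T^* w \rangle \\
&= \langle v, a T^* w \rangle,
\end{aligned}
\]

which shows that $a T^* w$ plays the role required of $T^*(aw)$. Because
only one vector can behave that way, we must have

\[ a T^* w = T^*(aw). \]

Thus $T^*$ is a linear map, as claimed.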
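The defining property $\langle Tv, w \rangle = \langle v, T^* w \rangle$ and the linearity of $T^*$ can be checked on the example above. In the sketch below, `T_star` is the formula from the text; the map `T` is inferred from that formula (an assumption, since the definition of $T$ is not reproduced here), and the sample vectors are arbitrary.

```python
def T(x):
    """T: R^3 -> R^2, inferred so that T_star below is its adjoint (assumption)."""
    x1, x2, x3 = x
    return (x2 + 3 * x3, 2 * x1)

def T_star(y):
    """T*: R^2 -> R^3, as given in the text."""
    y1, y2 = y
    return (2 * y2, y1, 3 * y1)

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

samples_v = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (2, -1, 5)]
samples_w = [(1, 0), (0, 1), (3, -4)]

# <Tv, w> = <v, T*w> for every sampled pair
defining_property = all(
    inner(T(v), w) == inner(v, T_star(w))
    for v in samples_v for w in samples_w
)

# Additivity and homogeneity of T*
w1, w2, a = (3, -4), (1, 7), 5
additive = T_star(tuple(p + q for p, q in zip(w1, w2))) == tuple(
    p + q for p, q in zip(T_star(w1), T_star(w2))
)
homogeneous = T_star(tuple(a * p for p in w1)) == tuple(a * p for p in T_star(w1))

print(defining_property, additive, homogeneous)
```

With respect to the standard bases, the matrix of `T` has rows (0, 1, 3) and (2, 0, 0), and the matrix of `T_star` is its transpose, which is what one expects for a real inner-product space.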

You should verify that the function $T \mapsto T^*$ has the following
properties:

additivity

$(S + T)^* = S^* + T^*$ for all $S, T \in \mathcal{L}(V, W)$;

conjugate homogeneity

$(aT)^* = \bar{a} T^*$ for all $a \in \mathbf{F}$ and $T \in \mathcal{L}(V, W)$;

adjoint of adjoint

$(T^*)^* = T$ for all $T \in \mathcal{L}(V, W)$;

identity

$I^* = I$, where $I$ is the identity operator on $V$;







id 5 e £(W t U) (here U is an 

between the null space and 
he symbol means “if and 
mean “is equivalent to”. 

).Then 


w € ： W. Then 

0 for all v e V 
for all v e V 
























Proposition: Suppose $T \in \mathcal{L}(V, W)$. If $(e_1, \dots, e_n)$ is an
orthonormal basis of $V$ and $(f_1, \dots, f_m)$ is an orthonormal basis of $W$,
then

\[ \mathcal{M}(T^*, (f_1, \dots, f_m), (e_1, \dots, e_n)) \]

is the conjugate transpose of

\[ \mathcal{M}(T, (e_1, \dots, e_n), (f_1, \dots, f_m)). \]


Proof: Suppose that $(e_1, \dots, e_n)$ is an orthonormal basis of $V$ and
$(f_1, \dots, f_m)$ is an orthonormal basis of $W$. We write $\mathcal{M}(T)$ instead of the
longer expression $\mathcal{M}(T, (e_1, \dots, e_n), (f_1, \dots, f_m))$; we also write $\mathcal{M}(T^*)$
instead of $\mathcal{M}(T^*, (f_1, \dots, f_m), (e_1, \dots, e_n))$.

Recall that we obtain the $k$th column of $\mathcal{M}(T)$ by writing $Te_k$ as a
linear combination of the $f_j$'s; the scalars used in this linear combination
then become the $k$th column of $\mathcal{M}(T)$. Because $(f_1, \dots, f_m)$ is an ortho-


The adjoint of a linear
map does not depend
on a choice of basis.
This explains why we
will emphasize adjoints
of linear maps instead
of conjugate
transposes of matrices.


M{T). 













8. A norm on a vector space $U$ is a function $\| \; \| : U \to [0, \infty)$ such
that $\|u\| = 0$ if and only if $u = 0$, $\|\alpha u\| = |\alpha| \|u\|$ for all $\alpha \in \mathbf{F}$
and all $u \in U$, and $\|u + v\| \le \|u\| + \|v\|$ for all $u, v \in U$. Prove
that a norm satisfying the parallelogram equality comes from
an inner product (in other words, show that if $\| \; \|$ is a norm
on $U$ satisfying the parallelogram equality, then there is an inner
product $\langle \; , \; \rangle$ on $U$ such that $\|u\| = \langle u, u \rangle^{1/2}$ for all $u \in U$).

9. Suppose $n$ is a positive integer. Prove that

\[
\Bigl( \frac{1}{\sqrt{2\pi}}, \frac{\sin x}{\sqrt{\pi}}, \frac{\sin 2x}{\sqrt{\pi}}, \dots, \frac{\sin nx}{\sqrt{\pi}}, \frac{\cos x}{\sqrt{\pi}}, \frac{\cos 2x}{\sqrt{\pi}}, \dots, \frac{\cos nx}{\sqrt{\pi}} \Bigr)
\]

is an orthonormal list of vectors in $C[-\pi, \pi]$, the vector space of
continuous real-valued functions on $[-\pi, \pi]$ with inner product

\[ \langle f, g \rangle = \int_{-\pi}^{\pi} f(x) g(x) \, dx. \]

This orthonormal list is
often used for
modeling periodic
phenomena such as
tides.

10. On $\mathcal{P}_2(\mathbf{R})$, consider the inner product given by

\[ \langle p, q \rangle = \int_0^1 p(x) q(x) \, dx. \]

Apply the Gram-Schmidt procedure to the basis $(1, x, x^2)$ to
produce an orthonormal basis of $\mathcal{P}_2(\mathbf{R})$.





p to $p'$) on $\mathcal{P}_2(\mathbf{R})$ has an upper-triangular matrix with
this basis.


15. Suppose $U$ is a subspace of $V$. Prove that

\[ \dim U^\perp = \dim V - \dim U. \]

16. Suppose $U$ is a subspace of $V$. Prove that $U^\perp = \{0\}$ if and only if
$U = V$.

17. Prove that if $P \in \mathcal{L}(V)$ is such that $P^2 = P$ and every vector
in $\operatorname{null} P$ is orthogonal to every vector in $\operatorname{range} P$, then $P$ is an
orthogonal projection.

18. Prove that if $P \in \mathcal{L}(V)$ is such that $P^2 = P$ and

\[ \|Pv\| \le \|v\| \]

for every $v \in V$, then $P$ is an orthogonal projection.

19. Suppose $T \in \mathcal{L}(V)$ and $U$ is a subspace of $V$. Prove that $U$ is
invariant under $T$ if and only if $P_U T P_U = T P_U$.


)ssible. 




nt approx- 








Exercises 




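24. Find a polynomial $q \in \mathcal{P}_2(\mathbf{R})$ such that

\[ p(\tfrac{1}{2}) = \int_0^1 p(x) q(x) \, dx \]

for every $p \in \mathcal{P}_2(\mathbf{R})$.

25. Find a polynomial $q \in \mathcal{P}_2(\mathbf{R})$ such that

\[ \int_0^1 p(x) (\cos \pi x) \, dx = \int_0^1 p(x) q(x) \, dx \]

for every $p \in \mathcal{P}_2(\mathbf{R})$.

26. Fix a vector $v \in V$ and define $T \in \mathcal{L}(V, \mathbf{F})$ by $Tu = \langle u, v \rangle$. For
$a \in \mathbf{F}$, find a formula for $T^* a$.

27. Suppose $n$ is a positive integer. Define $T \in \mathcal{L}(\mathbf{F}^n)$ by

\[ T(z_1, \dots, z_n) = (0, z_1, \dots, z_{n-1}). \]

Find a formula for $T^*(z_1, \dots, z_n)$.

28. Suppose $T \in \mathcal{L}(V)$ and $\lambda \in \mathbf{F}$. Prove that $\lambda$ is an eigenvalue of $T$
if and only if $\bar{\lambda}$ is an eigenvalue of $T^*$.

29. Suppose $T \in \mathcal{L}(V)$ and $U$ is a subspace of $V$. Prove that $U$ is
invariant under $T$ if and only if $U^\perp$ is invariant under $T^*$.

30. Suppose $T \in \mathcal{L}(V, W)$. Prove that

(a) $T$ is injective if and only if $T^*$ is surjective;

(b) $T$ is surjective if and only if $T^*$ is injective.

31. Prove that

\[ \dim \operatorname{null} T^* = \dim \operatorname{null} T + \dim W - \dim V \]

and

\[ \dim \operatorname{range} T^* = \dim \operatorname{range} T \]

for every $T \in \mathcal{L}(V, W)$.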

32. Suppose A is an m-by-n matrix of real numbers. Prove that the 


is the 









Chapter 7 


Operators on
Inner-Product Spaces



V is a finite-dimensional, nonzero, inner-product space over F. 









Self-Adjoint and Normal Operators

Instead of self-adjoint,

An operator $T \in \mathcal{L}(V)$ is called self-adjoint if $T = T^*$. For








7.2 Proposition: If $V$ is a complex inner-product space and $T$ is an
operator on $V$ such that

\[ \langle Tv, v \rangle = 0 \]

for all $v \in V$, then $T = 0$.


Proof: Suppose $V$ is a complex inner-product space and $T \in \mathcal{L}(V)$.
Then

\[
\begin{aligned}
\langle Tu, w \rangle = {} & \frac{\langle T(u + w), u + w \rangle - \langle T(u - w), u - w \rangle}{4} \\
& + \frac{\langle T(u + iw), u + iw \rangle - \langle T(u - iw), u - iw \rangle}{4} \, i
\end{aligned}
\]

for all $u, w \in V$, as can be verified by computing the right side. Note
that each term on the right side is of the form $\langle Tv, v \rangle$ for appropriate
$v \in V$. If $\langle Tv, v \rangle = 0$ for all $v \in V$, then the equation above implies that
$\langle Tu, w \rangle = 0$ for all $u, w \in V$. This implies that $T = 0$ (take $w = Tu$). ■
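The identity used in this proof can be verified numerically, and the same code shows why the complex scalars matter. The sketch below assumes the Euclidean inner product on $\mathbf{C}^2$ (linear in the first slot); the matrices, vectors, and helper names are illustrative.

```python
def inner(u, v):
    """Euclidean inner product: linear in the first slot, conjugate-linear in the second."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

def apply(m, v):
    """Apply the matrix m (list of rows) to the vector v."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in m]

def q(m, v):
    """The quantity <Tv, v> for the operator with matrix m."""
    return inner(apply(m, v), v)

def polarization(m, u, w):
    """Recover <Tu, w> from values of <Tv, v> only."""
    plus = [a + b for a, b in zip(u, w)]
    minus = [a - b for a, b in zip(u, w)]
    plus_i = [a + 1j * b for a, b in zip(u, w)]
    minus_i = [a - 1j * b for a, b in zip(u, w)]
    return (q(m, plus) - q(m, minus)) / 4 + (q(m, plus_i) - q(m, minus_i)) / 4 * 1j

T = [[1 + 2j, 3], [-1j, 4 - 1j]]
u = [2 - 1j, 1j]
w = [1, 3 + 2j]

direct = inner(apply(T, u), w)
recovered = polarization(T, u, w)
print(direct, recovered)

# On a real inner-product space the proposition fails: rotation by 90 degrees
# on R^2 is nonzero, yet <Rv, v> = 0 for every real vector v.
R = [[0, -1], [1, 0]]
real_value = q(R, [3.0, 4.0])
print(real_value)
```

The rotation example depends on restricting to real vectors; the vectors $u \pm iw$ needed by the identity are not available in a real space.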


The following corollary is false for real inner-product spaces, as
shown by considering any operator on a real inner-product space that
is not self-adjoint.


7.3 Corollary: Let $V$ be a complex inner-product space and let
$T \in \mathcal{L}(V)$. Then $T$ is self-adjoint if and only if

\[ \langle Tv, v \rangle \in \mathbf{R} \]

for every $v \in V$.


This corollary provides 
another example of 



operators behave like 
real numbers. 


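Proof: Let $v \in V$. Then

\[
\begin{aligned}
\langle Tv, v \rangle - \overline{\langle Tv, v \rangle} &= \langle Tv, v \rangle - \langle v, Tv \rangle \\
&= \langle Tv, v \rangle - \langle T^* v, v \rangle \\
&= \langle (T - T^*)v, v \rangle.
\end{aligned}
\]

If $\langle Tv, v \rangle \in \mathbf{R}$ for every $v \in V$, then the left side of the equation above
equals 0, so $\langle (T - T^*)v, v \rangle = 0$ for every $v \in V$. This implies that
$T - T^* = 0$ (by 7.2), and hence $T$ is self-adjoint.

Conversely, if $T$ is self-adjoint, then the right side of the equation
above equals 0, so $\langle Tv, v \rangle = \overline{\langle Tv, v \rangle}$ for every $v \in V$. This implies that
$\langle Tv, v \rangle \in \mathbf{R}$ for every $v \in V$, as desired. ■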
























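Proof: Suppose $T \in \mathcal{L}(V)$ is normal and $\alpha, \beta$ are distinct
eigenvalues of $T$, with corresponding eigenvectors $u, v$. Thus $Tu = \alpha u$ and
$Tv = \beta v$. From 7.7 we have $T^* v = \bar{\beta} v$. Thus

\[
\begin{aligned}
(\alpha - \beta) \langle u, v \rangle &= \langle \alpha u, v \rangle - \langle u, \bar{\beta} v \rangle \\
&= \langle Tu, v \rangle - \langle u, T^* v \rangle \\
&= 0.
\end{aligned}
\]

Because $\alpha \ne \beta$, the equation above implies that $\langle u, v \rangle = 0$. Thus $u$ and
$v$ are orthogonal, as desired. ■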


The Spectral Theorem





















a complex Because every 

rmal basis self-adjoint operator is 










(because $a_{1,2} = 0$, as we showed in the paragraph above) and

\[ \|T^* e_2\|^2 = |a_{2,2}|^2 + |a_{2,3}|^2 + \dots + |a_{2,n}|^2. \]

Because $T$ is normal, $\|Te_2\| = \|T^* e_2\|$. Thus the two equations above
imply that all entries in the second row of the matrix in 7.10, except
possibly the diagonal entry $a_{2,2}$, equal 0.

Continuing in this fashion, we see that all the nondiagonal entries
in the matrix 7.10 equal 0, as desired. ■

We will need two lemmas for our proof of the real spectral
theorem. You could guess that the next lemma is true and even discover its
proof by thinking about quadratic polynomials with real coefficients.
Specifically, suppose $\alpha, \beta \in \mathbf{R}$ and $\alpha^2 < 4\beta$. Let $x$ be a real number.
Then

\[ x^2 + \alpha x + \beta = \Bigl( x + \frac{\alpha}{2} \Bigr)^2 + \Bigl( \beta - \frac{\alpha^2}{4} \Bigr) > 0. \]

This technique of
completing the square
can be used to derive
the quadratic formula.

In particular, $x^2 + \alpha x + \beta$ is an invertible real number (a convoluted
way of saying that it is not 0). Replacing the real number $x$ with a
self-adjoint operator (recall the analogy between real numbers and
self-adjoint operators), we are led to the lemma below.


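7.11 Lemma: Suppose $T \in \mathcal{L}(V)$ is self-adjoint. If $\alpha, \beta \in \mathbf{R}$ are such
that $\alpha^2 < 4\beta$, then

\[ T^2 + \alpha T + \beta I \]

is invertible.


Proof: Suppose $\alpha, \beta \in \mathbf{R}$ are such that $\alpha^2 < 4\beta$. Let $v$ be a nonzero
vector in $V$. Then

\[
\begin{aligned}
\langle (T^2 + \alpha T + \beta I)v, v \rangle &= \langle T^2 v, v \rangle + \alpha \langle Tv, v \rangle + \beta \langle v, v \rangle \\
&= \langle Tv, Tv \rangle + \alpha \langle Tv, v \rangle + \beta \|v\|^2 \\
&\ge \|Tv\|^2 - |\alpha| \, \|Tv\| \, \|v\| + \beta \|v\|^2 \\
&= \Bigl( \|Tv\| - \frac{|\alpha| \, \|v\|}{2} \Bigr)^2 + \Bigl( \beta - \frac{\alpha^2}{4} \Bigr) \|v\|^2 \\
&> 0,
\end{aligned}
\]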




























tius adjoining u to an orthonormal basis 
>rs of 5 gives an orthonormal basis of V 
r, as desired. ■ 








Normal Operators on Real
Inner-Product Spaces


The complex spectral theorem (7.9) gives a complete 






(a)

(b) 


(c) 


the matrix of T with respect to some orthoni 
the form 


with b > 0. 


Proof: First suppose that (a) holds, so that :
self-adjoint. Let $(e_1, e_2)$ be an orthonormal basis o


7.16   $\mathcal{M}(T, (e_1, e_2)) = \begin{bmatrix} a & c \\ b & d \end{bmatrix}$


Then $\|Te_1\|^2 = a^2 + b^2$ and $\|T^* e_1\|^2 = a^2 + c^2$. B
$\|Te_1\| = \|T^* e_1\|$ (see 7.6); thus these equations
Thus $c = b$ or $c = -b$. But $c \ne b$ because otherw



















































Proof ： Tc 
orthonormal t 


Positive Operators


Many mathematicians
also use the term
positive semidefinite
operator, which means
the same as positive
operator.


An operator $T \in \mathcal{L}(V)$ is called positive if $T$ is self-adjoint and

\[ \langle Tv, v \rangle \ge 0 \]

for all $v \in V$. Note that if $V$ is a complex vector space, then the
condition that $T$ be self-adjoint can be dropped from this definition
(by 7.3).

















itive. For 
e showed 
a 2 < 4P, 


: S 2 = T. 

， 0)，then 
a square 

perators. The positive operators 






First suppose that (a) holds, so that $T$ is positive. Obviously $T$ is
self-adjoint (by the definition of a positive operator). To prove the other
condition in (b), suppose that $\lambda$ is an eigenvalue of $T$. Let $v$ be a nonzero
eigenvector of $T$ corresponding to $\lambda$. Then

\[
\begin{aligned}
0 &\le \langle Tv, v \rangle \\
&= \langle \lambda v, v \rangle \\
&= \lambda \langle v, v \rangle,
\end{aligned}
\]

and thus $\lambda$ is a nonnegative number. Hence (b) holds.

Now suppose that (b) holds, so that $T$ is self-adjoint and all the
eigenvalues of $T$ are nonnegative. By the spectral theorem (7.9 and 7.13),

























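that $\alpha^2$ must equal some $\lambda_j$. In other words, $\alpha = \sqrt{\lambda_j}$ for some $j$.
Furthermore, 7.30 implies that

7.31   $\operatorname{null}(S - \sqrt{\lambda_j} I) \subset \operatorname{null}(T - \lambda_j I)$.

In the paragraph above, we showed that the only possible
eigenvalues for $S$ are $\sqrt{\lambda_1}, \dots, \sqrt{\lambda_m}$. Because $S$ is self-adjoint, this implies that

7.32   $V = \operatorname{null}(S - \sqrt{\lambda_1} I) \oplus \dots \oplus \operatorname{null}(S - \sqrt{\lambda_m} I)$



for each $j$. In other words, on $\operatorname{null}(T - \lambda_j I)$, the operator $S$ is just
multiplication by $\sqrt{\lambda_j}$. Thus $S$, the positive square root of $T$, is uniquely
determined by $T$. ■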


An operator $S \in \mathcal{L}(V)$ is called an isometry if

\[ \|Sv\| = \|v\| \]


means ( 
word rr, 


for all $v \in V$. In other words, an operator is an isometry if it preserves
norms. For example, $\lambda I$ is an isometry whenever $\lambda \in \mathbf{F}$ satisfies $|\lambda| = 1$.
More generally, suppose $\lambda_1, \dots, \lambda_n$ are scalars with absolute value 1 and
$S \in \mathcal{L}(V)$ satisfies $S(e_j) = \lambda_j e_j$ for some orthonormal basis $(e_1, \dots, e_n)$
of $V$. Suppose $v \in V$. Then


measure 
isometry 
means e 、 





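where we have used 6.17. Applying $S$ to both sides of 7.33 gives

\[
\begin{aligned}
Sv &= \langle v, e_1 \rangle Se_1 + \dots + \langle v, e_n \rangle Se_n \\
&= \lambda_1 \langle v, e_1 \rangle e_1 + \dots + \lambda_n \langle v, e_n \rangle e_n.
\end{aligned}
\]

The last equation, along with the equation $|\lambda_j| = 1$, shows that

7.35   $\|Sv\|^2 = |\langle v, e_1 \rangle|^2 + \dots + |\langle v, e_n \rangle|^2$.

Comparing 7.34 and 7.35 shows that $\|v\| = \|Sv\|$. In other words, $S$ is
an isometry.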
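The example above is easy to test in coordinates. The following sketch assumes $V = \mathbf{C}^3$ with the standard basis playing the role of the orthonormal basis $(e_1, e_2, e_3)$; the scalars in `lambdas` and the vector `v` are arbitrary illustrative choices.

```python
import cmath
import math

def norm(v):
    return math.sqrt(sum(abs(a) ** 2 for a in v))

# Scalars of absolute value 1; S multiplies the j-th coordinate by lambdas[j].
lambdas = [1j, -1, cmath.exp(0.7j)]

def S(v):
    return [lam * a for lam, a in zip(lambdas, v)]

v = [3 - 4j, 2, 1 + 1j]
norm_v = norm(v)
norm_Sv = norm(S(v))
print(norm_v, norm_Sv)

# Replacing one scalar by 2 (absolute value not 1) breaks the property.
stretched = [2 * v[0]] + v[1:]
print(norm(stretched))
```

The first two printed values agree up to rounding, while the stretched vector has a strictly larger norm.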

An isometry on a real

For another example, let $\theta \in \mathbf{R}$. Then the operator on $\mathbf{R}^2$ of coun-













(g) $\langle S^* u, S^* v \rangle = \langle u, v \rangle$ for all $u, v \in V$;

(h) $SS^* = I$;

(i) $(S^* e_1, \dots, S^* e_n)$ is orthonormal whenever $(e_1, \dots, e_n)$ is an
orthonormal list of vectors in $V$;

(j) there exists an orthonormal basis $(e_1, \dots, e_n)$ of $V$ such that
$(S^* e_1, \dots, S^* e_n)$ is orthonormal.

Proof: First suppose that (a) holds. If $V$ is a real inner-product
space, then for every $u, v \in V$ we have

\[
\begin{aligned}
\langle Su, Sv \rangle &= (\|Su + Sv\|^2 - \|Su - Sv\|^2)/4 \\
&= (\|S(u + v)\|^2 - \|S(u - v)\|^2)/4 \\
&= (\|u + v\|^2 - \|u - v\|^2)/4 \\
&= \langle u, v \rangle,
\end{aligned}
\]

where the first equality comes from Exercise 6 in Chapter 6, the second
equality comes from the linearity of $S$, the third equality holds because


ipter 6. If $V$ is a complex inner-product space
: hapter 6 instead of Exercise 6 to obtain the
ler case, we see that (a) implies (b).

Now suppose that (b) holds. Then

$\langle (S^* S - I)u, v \rangle = \langle Su, Sv \rangle -$ (u
$= 0$

every $u, v \in V$. Taking $v = (S^* S - I)u$, we 5
ice $S^* S = I$, proving that (b) implies (c).

Now suppose that (c) holds. Suppose
of vectors in $V$. Then

\[
\begin{aligned}
\langle Se_j, Se_k \rangle &= \langle S^* S e_j, e_k \rangle \\
&= \langle e_j, e_k \rangle.
\end{aligned}
\]

ice $(Se_1, \dots, Se_n)$ is orthonormal, proving that
Obviously (d) implies (e).

Now suppose (e) holds. Let $(e_1, \dots, e_n)$ be an or
h that $(Se_1, \dots, Se_n)$ is orthonormal. If $v \in V$.






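\[
\begin{aligned}
\|Sv\|^2 &= \|S(\langle v, e_1 \rangle e_1 + \dots + \langle v, e_n \rangle e_n)\|^2 \\
&= \|\langle v, e_1 \rangle Se_1 + \dots + \langle v, e_n \rangle Se_n\|^2 \\
&= |\langle v, e_1 \rangle|^2 + \dots + |\langle v, e_n \rangle|^2 \\
&= \|v\|^2,
\end{aligned}
\]

where the first and last equalities come from 6.17. Taking square roots,
we see that $S$ is an isometry, proving that (e) implies (a).

Having shown that (a) $\Rightarrow$ (b) $\Rightarrow$ (c) $\Rightarrow$ (d) $\Rightarrow$ (e) $\Rightarrow$ (a), we know at this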




Proof: We alread 
that if there is an orth 




























Continuing w 
0 can be wr 













Polar and Singular-Value Decompositions 




where the first factor, namely, $z/|z|$, is an element of the unit circle. Our
analogy leads us to guess that any operator $T \in \mathcal{L}(V)$ can be written
as an isometry times $\sqrt{T^* T}$. That guess is indeed correct, as we now
prove.

7.41 Polar Decomposition: If $T \in \mathcal{L}(V)$, then there exists an
isometry $S \in \mathcal{L}(V)$ such that

\[ T = S \sqrt{T^* T}. \]

Proof: Suppose $T \in \mathcal{L}(V)$. If $v \in V$, then

\[
\begin{aligned}
\|Tv\|^2 &= \langle Tv, Tv \rangle \\
&= \langle T^* T v, v \rangle \\
&= \langle \sqrt{T^* T} \sqrt{T^* T} v, v \rangle \\
&= \langle \sqrt{T^* T} v, \sqrt{T^* T} v \rangle \\
&= \|\sqrt{T^* T} v\|^2.
\end{aligned}
\]

Thus

7.42   $\|Tv\| = \|\sqrt{T^* T} v\|$

for all $v \in V$.

Define a linear map $S_1 : \operatorname{range} \sqrt{T^* T} \to \operatorname{range} T$ by

7.43   $S_1(\sqrt{T^* T} v) = Tv$.

The idea of the proof is to extend $S_1$ to an isometry $S \in \mathcal{L}(V)$ such that
$T = S \sqrt{T^* T}$. Now for the details.

First we must check that $S_1$ is well defined. To do this, suppose
$v_1, v_2 \in V$ are such that $\sqrt{T^* T} v_1 = \sqrt{T^* T} v_2$. For the definition given
by 7.43 to make sense, we must show that $Tv_1 = Tv_2$. However,

\[
\begin{aligned}
\|Tv_1 - Tv_2\| &= \|T(v_1 - v_2)\| \\
&= \|\sqrt{T^* T}(v_1 - v_2)\| \\
&= \|\sqrt{T^* T} v_1 - \sqrt{T^* T} v_2\| \\
&= 0,
\end{aligned}
\]

where the second equality holds by 7.42. The equation above shows
that $Tv_1 = Tv_2$, so $S_1$ is indeed well defined. You should verify that $S_1$
is a linear map.


If you know a bit of 
complex analysis, you 
will recognize the 
analogy to polar 
coordinates for 
complex numbers: 
every complex number 
can be written in the 
form $e^{\theta i} r$, where
$\theta \in [0, 2\pi)$ and $r \ge 0$.
Note that $e^{\theta i}$ is in the



being an isometry, and 
r is nonnegative, 
corresponding to 
$\sqrt{T^* T}$ being a positive
operator. 






7.44 


v = 


where $u \in \operatorname{range} \sqrt{T^* T}$ and $w \in$ (rang
with decomposition as above, define $S$

$Sv = S_1 u +$


For each $v \in V$ we have

$S(\sqrt{T^* T} v) = S_1($

so $T = S \sqrt{T^* T}$, as desired. All that rem
etry. However, this follows easily from
theorem: if $v \in V$ has decomposition e

$\|Sv\|^2 = \|S_1 u +$ .
$= \|S_1 u\|^2$

$= \|u\|^2 +$
$= \|v\|^2,$






















This proof illustrates 
the usefulness of the 
polar decomposition. 


7.46 Singular-Value Decomposition: Suppose $T \in \mathcal{L}(V)$ has
singular values $s_1, \dots, s_n$. Then there exist orthonormal bases $(e_1, \dots, e_n)$
and $(f_1, \dots, f_n)$ of $V$ such that

7.47   $Tv = s_1 \langle v, e_1 \rangle f_1 + \dots + s_n \langle v, e_n \rangle f_n$

for every $v \in V$.


Proof: By the spectral theorem (also see 7.14) applied to $\sqrt{T^* T}$,
there is an orthonormal basis $(e_1, \dots, e_n)$ of $V$ such that $\sqrt{T^* T} e_j = s_j e_j$
for $j = 1, \dots, n$. We have

\[ v = \langle v, e_1 \rangle e_1 + \dots + \langle v, e_n \rangle e_n \]

for every $v \in V$ (see 6.17). Apply $\sqrt{T^* T}$ to both sides of this equation,
getting

\[ \sqrt{T^* T} v = s_1 \langle v, e_1 \rangle e_1 + \dots + s_n \langle v, e_n \rangle e_n \]

for every $v \in V$. By the polar decomposition (see 7.41), there is an
isometry $S \in \mathcal{L}(V)$ such that $T = S \sqrt{T^* T}$. Apply $S$ to both sides of the
equation above, getting

\[ Tv = s_1 \langle v, e_1 \rangle Se_1 + \dots + s_n \langle v, e_n \rangle Se_n \]

for every $v \in V$. For each $j$, let $f_j = Se_j$. Because $S$ is an isometry,
$(f_1, \dots, f_n)$ is an orthonormal basis of $V$ (see 7.36). The equation above
now becomes

\[ Tv = s_1 \langle v, e_1 \rangle f_1 + \dots + s_n \langle v, e_n \rangle f_n \]

for every $v \in V$, completing the proof.
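For a small concrete case, the polar decomposition and the singular values can be computed by hand-sized code. The sketch below makes several assumptions that are not part of the text: it works only with real, invertible 2-by-2 matrices, it uses a closed-form square root for 2-by-2 positive matrices (a consequence of the Cayley-Hamilton theorem), and it obtains the isometry as $S = T(\sqrt{T^* T})^{-1}$, which is available only because the example is invertible. All helper names are illustrative.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def sqrt_positive_2x2(a):
    """Positive square root of a real 2x2 positive matrix with det > 0,
    via sqrt(A) = (A + s I) / t with s = sqrt(det A), t = sqrt(tr A + 2 s)."""
    s = math.sqrt(a[0][0] * a[1][1] - a[0][1] * a[1][0])
    t = math.sqrt(a[0][0] + a[1][1] + 2 * s)
    return [[(a[i][j] + (s if i == j else 0.0)) / t for j in range(2)] for i in range(2)]

def inverse_2x2(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]

T = [[3.0, 0.0], [4.0, 5.0]]
P = sqrt_positive_2x2(matmul(transpose(T), T))  # plays the role of sqrt(T*T)
S = matmul(T, inverse_2x2(P))                   # candidate isometry with T = S P

StS = matmul(transpose(S), S)  # should be the identity if S is an isometry
SP = matmul(S, P)              # should reproduce T

# Singular values: eigenvalues of the symmetric 2x2 matrix P.
tr = P[0][0] + P[1][1]
det = P[0][0] * P[1][1] - P[0][1] * P[1][0]
disc = math.sqrt(tr * tr - 4 * det)
singular_values = [(tr + disc) / 2, (tr - disc) / 2]

print(S, singular_values)
```

For this matrix the singular values come out as $3\sqrt{5}$ and $\sqrt{5}$. When $T$ is not invertible, $S$ is not determined on all of $V$ by this formula, which is why the proof of 7.41 extends $S_1$ rather than inverting $\sqrt{T^* T}$.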


When we worked with linear maps from one vector space to a second 
vector space, we considered the matrix of a linear map with respect 
to a basis for the first vector space and a basis for the second vector 
space. When dealing with operators, which are linear maps from a 
vector space to itself, we almost always use only one basis, making it 
play both roles. 

The singular-value decomposition allows us a rare opportunity to
use two different bases for the matrix of an operator. To do this,
suppose $T \in \mathcal{L}(V)$. Let $s_1, \dots, s_n$ denote the singular values of $T$, and let
$(e_1, \dots, e_n)$ and $(f_1, \dots, f_n)$ be orthonormal bases of $V$ such that the
singular-value decomposition 7.47 holds. Then clearly



rators). The nonnegative 
les of T*T will be the (ap- 
n from the proof of 7.28). 
be approximated without 












Exercises

1. Make $\mathcal{P}_2(\mathbf{R})$ into an inner-product space by defining

\[ \langle p, q \rangle = \int_0^1 p(x) q(x) \, dx. \]

Define $T \in \mathcal{L}(\mathcal{P}_2(\mathbf{R}))$ by $T(a_0 + a_1 x + a_2 x^2) = a_1 x$.

(a) Show that $T$ is not self-adjoint.

(b) The matrix of $T$ with respect to the basis $(1, x, x^2)$ is

\[
\begin{bmatrix}
0 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 0
\end{bmatrix}.
\]

This matrix equals its conjugate transpose, even though $T$
is not self-adjoint. Explain why this is not a contradiction.

2. Prove or give a counterexample: the product of any two
self-adjoint operators on a finite-dimensional inner-product space is
self-adjoint.

3. (a) Show that if $V$ is a real inner-product space, then the set
of self-adjoint operators on $V$ is a subspace of $\mathcal{L}(V)$.

(b) Show that if $V$ is a complex inner-product space, then the
set of self-adjoint operators on $V$ is not a subspace of
$\mathcal{L}(V)$.


4. 


Suppose P e £(V) is such thatP 2 = P. Prove that P is an orthog- 





without the hypothesis 
that T is normal. 

： on V is positive, 
or every positive 











lat Ti and 乃 hav 
exist isometries 5i 


lue decomposition 





Chapter 8

Operators on
Complex Vector Spaces


In this chapter we delve deeper into the structure of operators on
complex vector spaces. An inner product does not help with this
material, so we return to the general setting of a finite-dimensional
vector space (as opposed to the more specialized context of an
inner-product space). Thus our assumptions for this chapter are as follows:

Recall that F denotes R or C.

Also, V is a finite-dimensional, nonzero vector space over F.

Some of the results in this chapter are valid on real vector spaces,
so we have not assumed that V is a complex vector space. Most of the
results in this chapter that are proved only for complex vector spaces
have analogous results on real vector spaces that are proved in the next
chapter. We deal with complex vector spaces first because the proofs
on complex vector spaces are often simpler than the analogous proofs
on real vector spaces.
















Generalized Eigenvectors
























ist a nonnegative integer m such that null' 
sition below shows that this equality holds 
dimension of the vector space on which T 

8.6 Proposition: If $T \in \mathcal{L}(V)$, then

nullT dimV = nuUT dlmV+l = nu] 

Proof: Suppose T g £(V). To get our 
only prove that null = null i 

true. Then, by 8.5, we have 

{0} = nullT 0 cnullT 1 9 . . . c null 

where the symbol s means “contained in t 
the strict inclusions in the chain above, the 
at least 1. Thus dim null > dimV 

a subspace of V cannot have a larger dime 

Now we have the promised description 


corollary implies 8.7 Corollary: Suppose T G £(V) and 






means 


perator of dif- potent literally 
i st derivative of means zero P° wer - 

on this space of 
or to the power 
r need to use a 

nN d{mV = 0. 

s a generalized 
rom 8.7 we see 


■s，we now turn 
a nonnegative 


become equal- 


These inclusions 










The Characteristic Polynomial


Suppose $V$ is a complex vector space and $T \in \mathcal{L}(V)$. We know that
$V$ has a basis with respect to which $T$ has an upper-triangular matrix
(see 5.13). In general, this matrix is not unique: $V$ may have many
different bases with respect to which $T$ has an upper-triangular matrix,

















8.10 Theorem: Let $T \in \mathcal{L}(V)$ and $\lambda \in \mathbf{F}$. Then for
every basis of $V$ with respect to which $T$ has an upper-triangular matrix,
$\lambda$ appears on the diagonal of the matrix of $T$ precisely
$\dim \operatorname{null}(T - \lambda I)^{\dim V}$ times.
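
As an illustration of 8.10 (not part of the proof), the following Python sketch counts how often a scalar appears on the diagonal of a small upper-triangular matrix and compares that count with $\dim \operatorname{null}(T - \lambda I)^{\dim V}$, computed by exact row reduction. The matrix `A` and the helper names `mat_mul`, `rank`, and `generalized_multiplicity` are assumptions made for this example only.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def rank(a):
    """Rank of a matrix, computed by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in a]
    r = 0  # number of pivots found so far
    for c in range(len(rows[0])):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(r + 1, len(rows)):
            if rows[i][c] != 0:
                f = rows[i][c] / rows[r][c]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def generalized_multiplicity(a, lam):
    """Return dim null (A - lam I)^n, where n is the size of A."""
    n = len(a)
    shifted = [[a[i][j] - (lam if i == j else 0) for j in range(n)]
               for i in range(n)]
    power = shifted
    for _ in range(n - 1):
        power = mat_mul(power, shifted)
    return n - rank(power)  # dimension of the null space


# Hypothetical upper-triangular matrix with diagonal (5, 0, 5).
A = [[5, 1, 2],
     [0, 0, 3],
     [0, 0, 5]]

for lam in (5, 0, 7):
    on_diagonal = sum(1 for j in range(len(A)) if A[j][j] == lam)
    print(lam, on_diagonal, generalized_multiplicity(A, lam))
```

For each scalar tried, the two printed counts agree, as the theorem predicts for this particular matrix.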


Proof: We will assume, without loss of generality, that $\lambda = 0$ (once
the theorem is proved in this case, the general case is obtained by
replacing $T$ with $T - \lambda I$).

For convenience let $n = \dim V$. We will prove this theorem by
induction on $n$. Clearly the desired result holds if $n = 1$. Thus we can
assume that $n > 1$ and that the desired result holds on spaces of dimension
$n - 1$.

Suppose $(v_1, \dots, v_n)$ is a basis of $V$ with respect to which $T$ has an
upper-triangular matrix

8.11
\[ \begin{bmatrix} \lambda_1 & & & * \\ & \ddots & & \\ & & \lambda_{n-1} & \\ 0 & & & \lambda_n \end{bmatrix}. \]

(Recall that an asterisk is often used in matrices to denote entries
that we do not know or care about.)

Let $U = \operatorname{span}(v_1, \dots, v_{n-1})$. Clearly $U$ is invariant
under $T$ (see 5.12), and the matrix of $T|_U$ with respect to the basis
$(v_1, \dots, v_{n-1})$ is

8.12
\[ \begin{bmatrix} \lambda_1 & & * \\ & \ddots & \\ 0 & & \lambda_{n-1} \end{bmatrix}. \]

Thus, by our induction hypothesis, 0 appears on the diagonal of 8.12
$\dim \operatorname{null}(T|_U)^{n-1}$ times. We know that
$\operatorname{null}(T|_U)^{n-1} = \operatorname{null}(T|_U)^n$ (because
$U$ has dimension $n - 1$; see 8.6). Hence

8.13  0 appears on the diagonal of 8.12 $\dim \operatorname{null}(T|_U)^n$ times.

The proof breaks into two cases, depending on whether $\lambda_n = 0$. First
consider the case where $\lambda_n \ne 0$. We will show that in this case

8.14  $\operatorname{null} T^n \subset U$.

Once this has been verified, we will know that
$\operatorname{null} T^n = \operatorname{null}(T|_U)^n$, and
hence 8.13 will tell us that 0 appears on the diagonal of 8.11 exactly
$\dim \operatorname{null} T^n$ times, completing the proof in the case where
$\lambda_n \ne 0$.

Because $\mathcal{M}(T)$ is given by 8.11, we have







dimn 


Suppose w 




vector ii 


that is 




















triangular (by 5.13). The 
appears on the diagonal c 
























nullp(r) 


Proof ： 
p(T)v = 0. 


lp(T). The 


of which is a nilpotent operator plus a scalar multiple of the identity. 
Actually we have already done all the hard work, so at this point the 
proof is easy. 

8.23 Theorem: Suppose $V$ is a complex vector space and $T \in \mathcal{L}(V)$.
Let $\lambda_1, \dots, \lambda_m$ be the distinct eigenvalues of $T$, and let
$U_1, \dots, U_m$ be the corresponding subspaces of generalized eigenvectors.
Then

(a) $V = U_1 \oplus \dots \oplus U_m$;

(b) each $U_j$ is invariant under $T$;

(c) each $(T - \lambda_j I)|_{U_j}$ is nilpotent.

Proof: Note that $U_j = \operatorname{null}(T - \lambda_j I)^{\dim V}$ for each
$j$ (by 8.7). From 8.22 (with $p(z) = (z - \lambda_j)^{\dim V}$), we get (b).
Obviously (c) follows from the definitions.

To prove (a), recall that the multiplicity of $\lambda_j$ as an eigenvalue of
$T$ is defined to be $\dim U_j$. The sum of these multiplicities equals
$\dim V$ (see 8.18); thus

8.24  $\dim V = \dim U_1 + \dots + \dim U_m$.

Let $U = U_1 + \dots + U_m$. Clearly $U$ is invariant under $T$. Thus we can
define $S \in \mathcal{L}(U)$ by

\[ S = T|_U. \]

Note that $S$ has the same eigenvalues, with the same multiplicities, as $T$















Lumns comes rrom basis vectors in nulliV^. Apply] 
^ctor, we get a vector in null JV; in other words, we 
a linear combination of the previous basis vectors, 
itries in these columns must lie above the diagonal 
lumns come from basis vectors in nullN 3 . Applyi 
ictor, we get a vector in null N 2 ; in other words, we 


a linear combination of the previous basis vectors. 

























Continue in this i 
aj so that the coe 
equals 0. Actual] 
a/s. We need on 
root of I + N. 

The previous 
However, the nex 

On real vector spaces 8.B2 Theorem: 


















The Minimal Polynomial

As we will soon see, given an operator on a finite-dimensional
vector space, there is a unique monic polynomial of smallest degree that
when applied to the operator gives 0. This polynomial is called the
minimal polynomial of the operator and is the focus of attention in
this section.

Suppose $T \in \mathcal{L}(V)$, where $\dim V = n$. Then

\[ (I, T, T^2, \dots, T^{n^2}) \]

cannot be linearly independent in $\mathcal{L}(V)$ because $\mathcal{L}(V)$ has dimension $n^2$


(A monic polynomial is a polynomial whose highest degree
coefficient equals 1. For example, $2 + 3z^2 + z^8$ is a monic
polynomial.)














ilgorithm (4.5), there exist polynomials s 
LB5 q = sp +r 

md degr < deg p. We have 


0 = q(T)=s(T)p(T) + 











Proof: Let

\[ p(z) = a_0 + a_1 z + a_2 z^2 + \dots + a_{m-1} z^{m-1} + z^m \]

be the minimal polynomial of $T$.

First suppose that $\lambda \in \mathbf{F}$ is a root of $p$. Then the minimal
polynomial of $T$ can be written in the form

\[ p(z) = (z - \lambda) q(z), \]

where $q$ is a monic polynomial with coefficients in $\mathbf{F}$ (see 4.1).
Because $p(T) = 0$, we have

\[ 0 = (T - \lambda I)(q(T)v) \]

for all $v \in V$. Because the degree of $q$ is less than the degree of the
minimal polynomial $p$, there must exist at least one vector $v \in V$ such
that $q(T)v \ne 0$. The equation above thus implies that $\lambda$ is an
eigenvalue of $T$, as desired.

To prove the other direction, now suppose that $\lambda \in \mathbf{F}$ is an
eigenvalue of $T$. Let $v$ be a nonzero vector in $V$ such that
$Tv = \lambda v$. Repeated applications of $T$ to both sides of this equation
show that $T^j v = \lambda^j v$ for every nonnegative integer $j$. Thus

\[ \begin{aligned}
0 = p(T)v &= (a_0 + a_1 T + a_2 T^2 + \dots + a_{m-1} T^{m-1} + T^m)v \\
&= (a_0 + a_1 \lambda + a_2 \lambda^2 + \dots + a_{m-1} \lambda^{m-1} + \lambda^m)v \\
&= p(\lambda)v.
\end{aligned} \]

Because $v \ne 0$, the equation above implies that $p(\lambda) = 0$, as desired. ■

Suppose we are given, in concrete form, the matrix (with respect to
some basis) of some operator $T \in \mathcal{L}(V)$. To find the minimal
polynomial of $T$, consider

\[ (\mathcal{M}(I), \mathcal{M}(T), \mathcal{M}(T)^2, \dots, \mathcal{M}(T)^m) \]

for $m = 1, 2, \dots$ until this list is linearly dependent. Then find the
scalars $a_0, a_1, a_2, \dots, a_{m-1} \in \mathbf{F}$ such that

\[ a_0 \mathcal{M}(I) + a_1 \mathcal{M}(T) + a_2 \mathcal{M}(T)^2 + \dots + a_{m-1} \mathcal{M}(T)^{m-1} + \mathcal{M}(T)^m = 0. \]

The scalars $a_0, a_1, a_2, \dots, a_{m-1}, 1$ will then be the coefficients of
the minimal polynomial of $T$. All this can be computed using a familiar
process such as Gaussian elimination.


(You can think of this as a system of $(\dim V)^2$ equations in $m$
variables $a_0, a_1, \dots, a_{m-1}$.)
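
The procedure just described can be sketched in Python. This is a minimal illustration, assuming the matrix has integer or rational entries so that exact arithmetic with `Fraction` is possible. The function names `mat_mul`, `solve`, and `minimal_polynomial` are hypothetical and chosen for this example. Each power of the matrix is flattened into a vector of length $(\dim V)^2$, and Gaussian elimination is used to test whether the next power is a linear combination of the earlier ones.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def solve(columns, target):
    """Solve sum_i x[i] * columns[i] == target by Gaussian elimination.

    Returns the list of solutions x, or None if no solution is found.
    Assumes the columns are linearly independent, which holds below
    because we stop at the first m that works.
    """
    m = len(columns)
    rows = [[col[r] for col in columns] + [target[r]]
            for r in range(len(target))]
    pivot_row = 0
    for c in range(m):
        p = next((r for r in range(pivot_row, len(rows))
                  if rows[r][c] != 0), None)
        if p is None:
            return None
        rows[pivot_row], rows[p] = rows[p], rows[pivot_row]
        pv = rows[pivot_row][c]
        rows[pivot_row] = [x / pv for x in rows[pivot_row]]
        for r in range(len(rows)):
            if r != pivot_row and rows[r][c] != 0:
                f = rows[r][c]
                rows[r] = [x - f * y for x, y in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    # Remaining rows have zero coefficients; a nonzero right side
    # means the system is inconsistent.
    if any(row[-1] != 0 for row in rows[m:]):
        return None
    return [rows[i][-1] for i in range(m)]


def minimal_polynomial(matrix):
    """Return [a_0, a_1, ..., a_{m-1}, 1], lowest degree first."""
    n = len(matrix)
    mat = [[Fraction(x) for x in row] for row in matrix]
    identity = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

    def flatten(a):
        return [x for row in a for x in row]

    powers = [flatten(identity)]  # flattened I, M, M^2, ...
    current = identity
    for _ in range(n * n):
        current = mat_mul(current, mat)
        # Seek a_0 I + ... + a_{m-1} M^{m-1} = -M^m.
        coeffs = solve(powers, [-x for x in flatten(current)])
        if coeffs is not None:
            return coeffs + [Fraction(1)]
        powers.append(flatten(current))
    raise AssertionError("a dependence must appear within n*n steps")


print(minimal_polynomial([[2, 0], [0, 2]]))  # coefficients of z - 2
print(minimal_polynomial([[0, 1], [0, 0]]))  # coefficients of z^2
```

Exact rational arithmetic is used here so that the test for linear dependence is not affected by rounding; with floating-point entries a tolerance would be needed instead.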








is not 
there 




are approximately 

-1.67, 0.51, 1.40, -0.12 + 1.59i, -0.12 - 1.59i.


the nonreal eigenvalues occur as a pair, with each the complex 
i of the other, as expected for the roots of a polynomial with 




















s dimension 1. 




context. 




































Exercises

1. Define $T \in \mathcal{L}(\mathbf{C}^2)$ by

   \[ T(w, z) = (z, 0). \]

   Find all generalized eigenvectors of $T$.

2. Define $T \in \mathcal{L}(\mathbf{C}^2)$ by

   \[ T(w, z) = (-z, w). \]

   Find all generalized eigenvectors of $T$.

3. Suppose $T \in \mathcal{L}(V)$, $m$ is a positive integer, and $v \in V$ is
   such that $T^{m-1}v \ne 0$ but $T^m v = 0$. Prove that

   \[ (v, Tv, T^2 v, \dots, T^{m-1}v) \]

   is linearly independent.

4. Suppose $T \in \mathcal{L}(\mathbf{C}^3)$ is defined by
   $T(z_1, z_2, z_3) = (z_2, z_3, 0)$. Prove that $T$ has no square root.
   More precisely, prove that there does not exist
   $S \in \mathcal{L}(\mathbf{C}^3)$ such that $S^2 = T$.

5. Suppose $S, T \in \mathcal{L}(V)$. Prove that if $ST$ is nilpotent, then
   $TS$ is nilpotent.

6. Suppose $N \in \mathcal{L}(V)$ is nilpotent. Prove (without using 8.26)
   that 0 is the only eigenvalue of $N$.

7. Suppose $V$ is an inner-product space. Prove that if
   $N \in \mathcal{L}(V)$ is self-adjoint and nilpotent, then $N = 0$.

8. Suppose $N \in \mathcal{L}(V)$ is such that
   $\operatorname{null} N^{\dim V - 1} \ne \operatorname{null} N^{\dim V}$.
   Prove that $N$ is nilpotent and that

   \[ \dim \operatorname{null} N^j = j \]

   for every integer $j$ with $0 \le j \le \dim V$.

9. Suppose $T \in \mathcal{L}(V)$ and $m$ is a nonnegative integer such that

   \[ \operatorname{range} T^m = \operatorname{range} T^{m+1}. \]

   Prove that $\operatorname{range} T^k = \operatorname{range} T^m$ for all $k > m$.









minimal i 















Chapter 9


Operators on
Real Vector Spaces
















Upper-' 





















is a basis of W with respect to 
tatrix of the form 


>y-2 matrix with no eigenvalues. 
J chosen above, getting a basis 
Lee you (use 9.6) that the matrix 
: upper-triangular matrix of the 











teristic polynomial (2 
be (x - a) (x - d), ' 
now we are working ( 




















We already proved (a) in our discussion above. To prove (b). 










Suppose $V$ is a real vector space with dimension 2 and $T \in \mathcal{L}(V)$ has
no eigenvalues. The last proposition shows that there is precisely one
monic polynomial with degree 2 that when applied to $T$ gives 0. Thus,
though $T$ may have different matrices with respect to different bases,
each of these matrices must have the same characteristic polynomial.
For example, consider $T \in \mathcal{L}(\mathbf{R}^2)$ defined by

9.8  $T(x_1, x_2) = (3x_1 + 5x_2, -2x_1 - x_2)$.

The matrix of $T$ with respect to the standard basis of $\mathbf{R}^2$ is

\[ \begin{bmatrix} 3 & 5 \\ -2 & -1 \end{bmatrix}. \]

The characteristic polynomial of this matrix is $(x - 3)(x + 1) + 2 \cdot 5$,
which equals $x^2 - 2x + 7$. As you should verify, the matrix of $T$ with
respect to the basis $((-2, 1), (1, 2))$ equals

\[ \begin{bmatrix} 1 & -6 \\ 1 & 1 \end{bmatrix}. \]

The characteristic polynomial of this matrix is $(x - 1)(x - 1) + 1 \cdot 6$,
which equals $x^2 - 2x + 7$, the same result we obtained by using the
standard basis.
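
The verification suggested above can be sketched in Python. This is only an illustrative check, assuming the operator of 9.8. The helper names `coords`, `matrix_of_T`, and `char_poly` are hypothetical. Coordinates with respect to a basis of $\mathbf{R}^2$ are found by Cramer's rule, and the characteristic polynomial of a 2-by-2 matrix with rows $(a, c)$ and $(b, d)$ is taken to be $(x - a)(x - d) - bc$.

```python
from fractions import Fraction


def T(x1, x2):
    """The operator defined in 9.8."""
    return (3 * x1 + 5 * x2, -2 * x1 - x2)


def coords(v, b1, b2):
    """Coordinates of v with respect to the basis (b1, b2), by Cramer's rule."""
    det = b1[0] * b2[1] - b2[0] * b1[1]
    first = Fraction(v[0] * b2[1] - b2[0] * v[1], det)
    second = Fraction(b1[0] * v[1] - v[0] * b1[1], det)
    return (first, second)


def matrix_of_T(b1, b2):
    """Matrix of T with respect to (b1, b2); column k holds coords of T(b_k)."""
    c1 = coords(T(*b1), b1, b2)
    c2 = coords(T(*b2), b1, b2)
    return [[c1[0], c2[0]], [c1[1], c2[1]]]


def char_poly(m):
    """Return (p, q) such that the characteristic polynomial is x^2 + p x + q."""
    (a, c), (b, d) = m
    return (-(a + d), a * d - b * c)


standard = matrix_of_T((1, 0), (0, 1))
other = matrix_of_T((-2, 1), (1, 2))
print(standard, char_poly(standard))
print(other, char_poly(other))
```

Both bases give the coefficient pair $(-2, 7)$, matching $x^2 - 2x + 7$.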

When analyzing upper-triangular matrices of an operator $T$ on a
complex vector space $V$, we found that subspaces of the form

\[ \operatorname{null}(T - \lambda I)^{\dim V} \]




































i matrices 
omial p. 


l exponent dim [/ in¬ 


s a different vector 




The proof now breaks into two cases. First consider the case where
the characteristic polynomial of $A_m$ does not equal $p$. We will show
that in this case

9.15  $\operatorname{null} p(T)^n \subset U$.

Once this has been verified, we will know that

\[ \operatorname{null} p(T)^n = \operatorname{null} p(T|_U)^n, \]








































































Chapter 10

Trace and Determinant


Throughout this book our emphasis has been on linear maps and
operators rather than on matrices. In this chapter we pay more attention
to matrices as we define and discuss traces and determinants.
Determinants appear only at the end of this book because we replaced their
usual applications in linear algebra (the definition of the
characteristic polynomial and the proof that operators on complex vector
spaces have eigenvalues) with more natural techniques. The book concludes
with an explanation of the important role played by determinants in
the theory of volume and integration.


Recall that F denotes R or C. 

Also, V is a finite-dimensional, nonzero vector space over F. 




























ps. Suppose that 


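dimensional vector spaces, say $U$ and $W$. Let $(u_1, \dots, u_p)$ be a basis
of $U$, let $(v_1, \dots, v_n)$ be a basis of $V$, and let $(w_1, \dots, w_m)$
be a basis of $W$. If $T \in \mathcal{L}(U, V)$ and $S \in \mathcal{L}(V, W)$,
then $ST \in \mathcal{L}(U, W)$ and

10.1
\[ \mathcal{M}\bigl(ST, (u_1, \dots, u_p), (w_1, \dots, w_m)\bigr) =
\mathcal{M}\bigl(S, (v_1, \dots, v_n), (w_1, \dots, w_m)\bigr)\,
\mathcal{M}\bigl(T, (u_1, \dots, u_p), (v_1, \dots, v_n)\bigr). \]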


The equation above holds because we defined matrix multiplication to 
make it true—see 3.11 and the material following it. 
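
A small Python sketch may make 10.1 concrete. The two linear maps below, and the helper names `matrix_of` and `mat_mul`, are hypothetical choices for this illustration, and all matrices are taken with respect to standard bases. The matrix of a map is built column by column from the images of the basis vectors, and the matrix of the composition is compared with the product of the two matrices.

```python
def T(x, y):
    """A hypothetical linear map from R^2 to R^3."""
    return (x + y, 2 * y, x)


def S(a, b, c):
    """A hypothetical linear map from R^3 to R^2."""
    return (a - c, b + c)


def ST(x, y):
    """The composition S after T, a linear map from R^2 to R^2."""
    return S(*T(x, y))


def matrix_of(linear_map, dim_in):
    """Matrix with respect to standard bases: column k is the image of e_k."""
    columns = [linear_map(*[1 if i == k else 0 for i in range(dim_in)])
               for k in range(dim_in)]
    return [list(row) for row in zip(*columns)]  # transpose columns to rows


def mat_mul(a, b):
    """Multiply an m-by-n matrix by an n-by-p matrix."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


M_T = matrix_of(T, 2)
M_S = matrix_of(S, 3)
M_ST = matrix_of(ST, 2)
print(M_ST, mat_mul(M_S, M_T))
```

The two printed matrices coincide, which is what 10.1 asserts for this choice of maps and bases.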

The following proposition deals with the matrix of the identity
operator when we use two different bases. Note that the $k$-th column of
$\mathcal{M}\bigl(I, (u_1, \dots, u_n), (v_1, \dots, v_n)\bigr)$ consists of the
scalars needed to write $u_k$ as a linear combination of the $v$'s. As an
example of the proposition below, consider the bases $((4, 2), (5, 3))$ and
$((1, 0), (0, 1))$ of $\mathbf{F}^2$. Obviously

\[ \mathcal{M}\bigl(I, ((4, 2), (5, 3)), ((1, 0), (0, 1))\bigr) =
\begin{bmatrix} 4 & 5 \\ 2 & 3 \end{bmatrix}. \]

The inverse of the matrix above is
$\begin{bmatrix} 3/2 & -5/2 \\ -1 & 2 \end{bmatrix}$, as you should verify. Thus
the proposition below implies that

\[ \mathcal{M}\bigl(I, ((1, 0), (0, 1)), ((4, 2), (5, 3))\bigr) =
\begin{bmatrix} 3/2 & -5/2 \\ -1 & 2 \end{bmatrix}. \]
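
The inverse in this example can be checked with a short Python sketch. The helper names `inverse_2x2` and `mat_mul` are assumptions made for illustration, and the inverse is computed with the usual 2-by-2 formula in exact rational arithmetic. The sketch also confirms what the columns mean: the first column of the inverse gives the scalars needed to write $(1, 0)$ in terms of $(4, 2)$ and $(5, 3)$.

```python
from fractions import Fraction


def inverse_2x2(m):
    """Inverse of a 2-by-2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]


def mat_mul(p, q):
    """Multiply two 2-by-2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


A = [[4, 5], [2, 3]]  # columns are the basis vectors (4, 2) and (5, 3)
A_inv = inverse_2x2(A)
print(A_inv)

# First column of A_inv: coefficients writing (1, 0) in the basis ((4, 2), (5, 3)).
c1, c2 = A_inv[0][0], A_inv[1][0]
print((c1 * 4 + c2 * 5, c1 * 2 + c2 * 3))
```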

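10.2 Proposition: If $(u_1, \dots, u_n)$ and $(v_1, \dots, v_n)$ are bases of
$V$, then $\mathcal{M}\bigl(I, (u_1, \dots, u_n), (v_1, \dots, v_n)\bigr)$ is
invertible and

\[ \mathcal{M}\bigl(I, (u_1, \dots, u_n), (v_1, \dots, v_n)\bigr)^{-1} =
\mathcal{M}\bigl(I, (v_1, \dots, v_n), (u_1, \dots, u_n)\bigr). \]

Proof: In 10.1, replace $U$ and $W$ with $V$, replace $w_j$ with $u_j$, and
replace $S$ and $T$ with $I$, getting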
















If $V$ is an $n$-dimensional real vector space and $T \in \mathcal{L}(V)$, then the
characteristic polynomial of $T$ equals

\[ (x - \lambda_1) \cdots (x - \lambda_m)(x^2 + \alpha_1 x + \beta_1) \cdots (x^2 + \alpha_M x + \beta_M), \]

where $\lambda_1, \dots, \lambda_m$ are the eigenvalues of $T$ and
$(\alpha_1, \beta_1), \dots, (\alpha_M, \beta_M)$ are the eigenpairs of $T$,
each repeated according to multiplicity. Expanding the polynomial above,
we can write the characteristic polynomial of $T$ in the form

10.7
\[ x^n - (\lambda_1 + \dots + \lambda_m - \alpha_1 - \dots - \alpha_M)x^{n-1} + \dots
+ (-1)^m (\lambda_1 \cdots \lambda_m \beta_1 \cdots \beta_M) \]































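Proof: Suppose

\[ A = \begin{bmatrix} a_{1,1} & \dots & a_{1,n} \\ \vdots & & \vdots \\ a_{n,1} & \dots & a_{n,n} \end{bmatrix},
\qquad
B = \begin{bmatrix} b_{1,1} & \dots & b_{1,n} \\ \vdots & & \vdots \\ b_{n,1} & \dots & b_{n,n} \end{bmatrix}. \]

The $j$-th term on the diagonal of $AB$ equals

\[ \sum_{k=1}^{n} a_{j,k} b_{k,j}. \]

Thus

\[ \begin{aligned}
\operatorname{trace}(AB) &= \sum_{j=1}^{n} \sum_{k=1}^{n} a_{j,k} b_{k,j} \\
&= \sum_{k=1}^{n} \sum_{j=1}^{n} b_{k,j} a_{j,k} \\
&= \sum_{k=1}^{n} \bigl(k\text{-th term on the diagonal of } BA\bigr) \\
&= \operatorname{trace}(BA).
\end{aligned} \]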
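
The computation in this proof can be mirrored in a few lines of Python. The matrices `A` and `B` and the helper names below are hypothetical, chosen so that $AB \ne BA$ while the traces still agree. The double sum from the proof is also evaluated directly.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    """Sum of the diagonal entries of a square matrix."""
    return sum(a[j][j] for j in range(len(a)))


A = [[1, 2], [3, 4]]
B = [[0, 5], [6, 7]]

AB = mat_mul(A, B)
BA = mat_mul(B, A)

# The double sum over j and k of a_{j,k} b_{k,j}, as in the proof.
double_sum = sum(A[j][k] * B[k][j] for j in range(2) for k in range(2))

print(AB, BA)  # the products differ
print(trace(AB), trace(BA), double_sum)  # the traces agree
```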














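$= \operatorname{trace}\bigl(\mathcal{M}(S)\mathcal{M}(T)\bigr)$

$= \operatorname{trace}\bigl(\mathcal{M}(T)\mathcal{M}(S)\bigr)$

$= \operatorname{trace} \mathcal{M}(TS)$

$= \operatorname{trace}(TS),$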

first and last equalities come from 10.11 and the middle 
mes from 10.9. This completes the proof of the first asser- 
corollary. 

e the second assertion in the corollary, note that 


the 


consequences 
























icial result that has an easy proof with our approach. 

Proposition: An operator is invertible if and only if its deter- 
> nonzero. 

7 ： First suppose V is a complex vector space and T g £(V). 
ator T is invertible if and only if 0 is not an eigenvalue of T. 














you read this chapter, 
you may want to 
concentrate on the 
basic ideas by 
considering only 
complex vector spaces 
and ignoring the 
special procedures 
needed to deal with 
real vector spaces. 

\[ T^2 + \alpha T + \beta I = (xI - T)^2 - (2x + \alpha)(xI - T) + (x^2 + \alpha x + \beta)I, \]

as you should verify. Thus $(\alpha, \beta)$ is an eigenpair of $T$ if and only if
$(-2x - \alpha, x^2 + \alpha x + \beta)$ is an eigenpair of $xI - T$. Furthermore,
raising both sides of the equation above to the $\dim V$ power and then taking
null spaces of both sides shows that the multiplicities are equal. ■

Most textbooks take the theorem below as the definition of the
characteristic polynomial. Texts using that approach must spend
considerably more time developing the theory of determinants before they get
to interesting linear algebra.

10.17 Theorem: Suppose $T \in \mathcal{L}(V)$. Then the characteristic
polynomial of $T$ equals $\det(zI - T)$.

Proof: First suppose $V$ is a complex vector space. Let
$\lambda_1, \dots, \lambda_n$ denote the eigenvalues of $T$, repeated according
to multiplicity. Thus for $z \in \mathbf{C}$, the eigenvalues of $zI - T$ are
$z - \lambda_1, \dots, z - \lambda_n$, repeated according to multiplicity. The
determinant of $zI - T$ is the product of these eigenvalues. In other words,


Proof: First we need to check that $(-2x - \alpha, x^2 + \alpha x + \beta)$
satisfies the inequality required of an eigenpair. We have

\[ \begin{aligned}
(-2x - \alpha)^2 &= 4x^2 + 4\alpha x + \alpha^2 \\
&< 4x^2 + 4\alpha x + 4\beta \\
&= 4(x^2 + \alpha x + \beta).
\end{aligned} \]

Thus $(-2x - \alpha, x^2 + \alpha x + \beta)$ satisfies the required inequality.

Now 


\[ \det(zI - T) = (z - \lambda_1) \cdots (z - \lambda_n). \]










;characteristic 
orrmlex vector 


),the 




det{xl-T) = (x-Ai)... (x-\ m )(x 2 + c 




t situation. Suppose V is a complex vector space, T e £(V), 
loose a basis of V with respect to which T has an upper- 
matrix. Then, as we noted in the last section, det T equals 
ct of the diagonal entries of this matrix. Could such a simple 
e true in general? 

unately the determinant is more complicated than the trace, 
lar, detT need not equal the product of the diagonal entries 

vith respect to an arbitrary basis. For example, the operator 

)se matrix equals 10.8 has determinant 13, as we saw in the 
n. However, the product of the diagonal entries of that matrix 

: h square matrix A, we want to define the determinant of A, 
et A, in such a way that detT = detM(T) regardless of which 
ed to compute M{T). We begin our search for the correct def- 
the determinant of a matrix by calculating the determinants 
pedal operators. 

… ， c n e F be nonzero scalars and let (vi,...,v n ) be a basis 
isider the operator T e £(V) such that M(T, (vi,..., v n )) 






















■18 (of course with a different value of n). 
that we should have 


lenumber (-l) Wl_1 ... (-l) nM_1 is called 
'i,...,p n ), denotedsign(pi, ...,p n ) (this 










the sign of the permutation 


Proof: Suppose we hav 






























the other terms in the sum 10.25 
tion. Hence det A = ai t \... a n ,n- In ( 
upper-triangular matrix equals the : 
particular, this means that if V is a 
and we choose a basis of V with res 
gular, then detT = detM(T). Our ! 
every basis of V, not just bases that 
Generalizing the computation fi 
will show that if A is a block upper- 


itA make no contribu 
the determinant of ai 
the diagonal entries. Ii 
ector space, T g £(V) 
: h M(T) is upper trian 

















































plicated than the proof of the 
i 10.9). 


B). A moment’s 
shows that AB = 

i n ,n^m n ] 

■■- ] j 











(Note the similarity of this proof to the proof of the analogous result
about the trace; see 10.10.)

Proof: Suppose $(u_1, \dots, u_n)$ and $(v_1, \dots, v_n)$ are bases of $V$. Let
$A = \mathcal{M}\bigl(I, (u_1, \dots, u_n), (v_1, \dots, v_n)\bigr)$. Then

\[ \begin{aligned}
\det \mathcal{M}\bigl(T, (u_1, \dots, u_n)\bigr)
&= \det\Bigl(A^{-1}\bigl(\mathcal{M}(T, (v_1, \dots, v_n))A\bigr)\Bigr) \\
&= \det\Bigl(\bigl(\mathcal{M}(T, (v_1, \dots, v_n))A\bigr)A^{-1}\Bigr) \\
&= \det \mathcal{M}\bigl(T, (v_1, \dots, v_n)\bigr),
\end{aligned} \]

where the first equality follows from 10.3 and the second equality fol-



























Proof: Suppose $S, T \in \mathcal{L}(V)$. Choose any basis of $V$. Then

\[ \begin{aligned}
\det(ST) &= \det \mathcal{M}(ST) \\
&= \det\bigl(\mathcal{M}(S)\mathcal{M}(T)\bigr) \\
&= \bigl(\det \mathcal{M}(S)\bigr)\bigl(\det \mathcal{M}(T)\bigr) \\
&= (\det S)(\det T),
\end{aligned} \]

where the first and last equalities come from 10.33 and the third
equality comes from 10.31.

In the paragraph above, we proved that $\det(ST) = (\det S)(\det T)$.
Interchanging the roles of $S$ and $T$, we have $\det(TS) = (\det T)(\det S)$.
Because multiplication of elements of $\mathbf{F}$ is commutative, the last
equation can be rewritten as $\det(TS) = (\det S)(\det T)$, completing the
proof. ■
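
The identity just proved can be checked numerically with a short Python sketch. It assumes the determinant of a matrix is computed as a signed sum over permutations, where each term takes one entry from every column and the sign is given by the parity of the number of inversions. The function names `sign`, `det`, and `mat_mul` and the sample matrices are hypothetical.

```python
from itertools import permutations


def sign(perm):
    """+1 if the permutation has an even number of inversions, else -1."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm))
                     if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def det(a):
    """Determinant as a signed sum over all permutations of the row indices."""
    n = len(a)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for col, row in enumerate(perm):
            term *= a[row][col]  # one entry from each column
        total += term
    return total


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


S = [[1, 2], [3, 4]]
T = [[0, 1], [5, 6]]
print(det(mat_mul(S, T)), det(mat_mul(T, S)), det(S) * det(T))

# For an upper-triangular matrix only the identity permutation contributes.
print(det([[2, 7, 1], [0, 3, 5], [0, 0, 4]]))
```

The permutation sum has $n!$ terms, so this sketch is suitable only for small matrices.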

Volume


We proved the basic results of linear algebra before introducing
determinants in this final chapter. Though determinants have value as a











Suppose $V$ is a real inner-product space and $T \in \mathcal{L}(V)$ is invertible.






















)lume is intimately con- 
this section we will rely 
rigorous development, 
tie linear algebra parts 
Dlume will be correct- 
id into formally correct 


of T(Q) in terms of T 
lple example. Suppose 




















hat T changes volumes by a factor of det T. 
e X(R n ) is an arbitrary operator. By the polar de- 
there is an isometry S e £(V) such that 

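\[ T = S\sqrt{T^*T}. \]

$T(\Omega) = S\bigl(\sqrt{T^*T}(\Omega)\bigr)$. Thus

\[ \begin{aligned}
\operatorname{volume} T(\Omega) &= \operatorname{volume} S\bigl(\sqrt{T^*T}(\Omega)\bigr) \\
&= \operatorname{volume} \sqrt{T^*T}(\Omega) \\
&= (\det \sqrt{T^*T})(\operatorname{volume} \Omega) \\
&= |\det T|\,(\operatorname{volume} \Omega),
\end{aligned} \]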









getting 





om Q to R. The partial derivative of aj 
mate is denoted Dkdj. Evaluating this 
e Q gives Dk(Tj(x). If a is differentiable 















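contains $D_k\sigma_j(x)$ in row $j$, column $k$ (we will not prove this). In other
words,

10.39
\[ \mathcal{M}(\sigma'(x)) = \begin{bmatrix} D_1\sigma_1(x) & \dots & D_n\sigma_1(x) \\ \vdots & & \vdots \\ D_1\sigma_n(x) & \dots & D_n\sigma_n(x) \end{bmatrix}. \]

Suppose that $\sigma$ is differentiable at each point of $\Omega$ and that
$\sigma$ is injective on $\Omega$. Let $f$ be a real-valued function defined on
$\sigma(\Omega)$. Let $x \in \Omega$ and let $\Gamma$ be a small subset of
$\Omega$ containing $x$. As we noted above,

\[ \operatorname{volume} \sigma(\Gamma) \approx \operatorname{volume} (\sigma'(x))(\Gamma), \]

where the symbol $\approx$ means "approximately equal to". Using 10.38, this
becomes

\[ \operatorname{volume} \sigma(\Gamma) \approx |\det \sigma'(x)|\,(\operatorname{volume} \Gamma). \]

Let $y = \sigma(x)$. Multiply the left side of the equation above by $f(y)$ and
the right side by $f(\sigma(x))$ (because $y = \sigma(x)$, these two quantities
are equal), getting

10.40  $f(y) \operatorname{volume} \sigma(\Gamma) \approx f(\sigma(x))\,|\det \sigma'(x)|\,(\operatorname{volume} \Gamma)$.

Now divide $\Omega$ into many small pieces and add the corresponding
versions of 10.40, getting

10.41
\[ \int_{\sigma(\Omega)} f(y)\,dy = \int_{\Omega} f(\sigma(x))\,|\det \sigma'(x)|\,dx. \]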
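
A numerical sketch in Python can illustrate the change of variables formula. As an assumed example, take $\sigma(r, \theta) = (r\cos\theta, r\sin\theta)$ on $\Omega = (0, 1) \times (0, 2\pi)$, whose image is the unit disk with a negligible slit removed. The function names and the midpoint Riemann sum are choices made for this illustration, not a rigorous treatment.

```python
import math


def sigma(r, theta):
    """The polar-coordinates map from (r, theta) to (x, y)."""
    return (r * math.cos(theta), r * math.sin(theta))


def jacobian_det(r, theta):
    """Determinant of the matrix of partial derivatives of sigma."""
    d1_s1, d2_s1 = math.cos(theta), -r * math.sin(theta)
    d1_s2, d2_s2 = math.sin(theta), r * math.cos(theta)
    return d1_s1 * d2_s2 - d2_s1 * d1_s2


def integrate_over_image(f, r_max=1.0, steps=200):
    """Midpoint Riemann sum of f(sigma(x)) |det sigma'(x)| over Omega."""
    dr = r_max / steps
    dtheta = 2 * math.pi / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr
        for j in range(steps):
            theta = (j + 0.5) * dtheta
            total += f(*sigma(r, theta)) * abs(jacobian_det(r, theta)) * dr * dtheta
    return total


area = integrate_over_image(lambda x, y: 1.0)
second_moment = integrate_over_image(lambda x, y: x * x + y * y)
print(area, math.pi)               # area of the unit disk
print(second_moment, math.pi / 2)  # integral of x^2 + y^2 over the disk
```

With $f = 1$ the sum approximates the area of the disk, and with $f(x, y) = x^2 + y^2$ it approximates $\pi/2$.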












coordinates. 














Suppose T 


Someone t 
of T. With 


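$\lambda_1, \dots, \lambda_n$ are the eigenvalues of $T$, repeated according to
multiplicity. Suppose

\[ \begin{bmatrix} a_{1,1} & \dots & a_{1,n} \\ \vdots & & \vdots \\ a_{n,1} & \dots & a_{n,n} \end{bmatrix} \]

is the matrix of $T$ with respect to some orthonormal basis of $V$.
Prove that

\[ |\lambda_1|^2 + \dots + |\lambda_n|^2 \le \sum_{k=1}^{n} \sum_{j=1}^{n} |a_{j,k}|^2. \]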

18. Suppose $V$ is an inner-product space. Prove that

    \[ \langle S, T \rangle = \operatorname{trace}(ST^*) \]

    defines an inner product on $\mathcal{L}(V)$.







Symbol Index







block upper-triangular matrix. 



Euclidean inner product, 


Index 


100 


field, 3 

finite dimensional, 22 


linear dependence lemma, 
25 

linear functional, 117 
linear map, 38 
linear span, 22 
linear subspace, 13 
linear transformation, 38 


orthonormal, 1 
orthonormal b< 






